(57) A method of increasing the accumulation of squalene and specific sterols in yeast comprising
increasing the expression level of a structural gene encoding a polypeptide having HMG-CoA reductase
activity in a mutant yeast having single or double defects in the expression of sterol biosynthetic
enzymes is provided. The expression level of a structural gene is preferably increased by transforming
yeast with a recombinant DNA molecule comprising a vector operatively linked to an exogenous DNA
segment that encodes a polypeptide having HMG-CoA reductase activity and a promoter that is suitable
for driving the expression of the encoded polypeptide in the transformed yeast. The polypeptide having
HMG-CoA reductase activity is preferably a truncated, active HMG-CoA reductase enzyme. Recombinant
DNA molecules useful for transforming yeast and mutant yeast transformed with such recombinant DNA
molecules are also disclosed.

Technical Field

The present invention relates to a method and composition for increasing the accumulation of squalene
and specific sterols in yeast. Squalene and sterol accumulation is increased by increasing the expression
level of a gene encoding a polypeptide having HMG-CoA reductase activity.

Background of the Invention 

As used herein, the term "sterol" refers to derivatives of a fused, reduced ring system,
cyclopenta-[a]-phenanthrene, comprising three fused cyclohexane rings (A, B and C) in a phenanthrene
arrangement and a terminal cyclopentane ring (D) having the formula and carbon atom position numbering
shown below:



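[Structural formula of the sterol ring system (rings A through D) with carbon atom position numbering
and sidechain R]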




where R is an 8 to 10 carbon-atom sidechain. 

Sterols are metabolically derived from acetate. Acetyl coenzyme A (CoA) reacts with acetoacetyl CoA to
form 3-hydroxy-3-methylglutaryl CoA (HMG-CoA). HMG-CoA is reduced to mevalonate in an irreversible
reaction catalyzed by the enzyme HMG-CoA reductase. Mevalonate is phosphorylated and decarboxylated to
isopentenyl pyrophosphate (IPP). Through the sequential steps of isomerization, condensation and
dehydrogenation, IPP is converted to geranyl pyrophosphate (GPP). GPP combines with IPP to form farnesyl
pyrophosphate (FPP), two molecules of which are reductively condensed to form squalene, a 30-carbon
precursor of sterols.

In yeast, squalene is converted to squalene epoxide, which is then cyclized to form lanosterol. Lanosterol
has two methyl groups at position 4, a methyl group at position 14, a double bond at position 8(9) and an
8 carbon sidechain of the formula:

CH3CH(CH2)2CH=C(CH3)2

Lanosterol is sequentially demethylated at positions 14 and 4 to form zymosterol (cholesta-8,24-dienol),
which is converted to ergosterol (ergosta-5,7,22-trienol), the most abundant sterol of naturally occurring,
wild-type yeast, via a series of five enzymatic reactions schematically diagramed in Figure 1.

The five reactions are: 

a. methylation of the carbon at position 24, catalyzed by a 24-methyltransferase;

b. movement of the double bond at position 8(9) to position 7(8), catalyzed by a Δ8-Δ7 isomerase;

c. introduction of a double bond at position 5(6), catalyzed by a 5-dehydrogenase (desaturase);

d. introduction of a double bond at position 22(23), catalyzed by a 22-dehydrogenase (desaturase); and

e. removal of a double bond at position 24(28), catalyzed by a 24(28)-hydrogenase (reductase).

In wild-type yeast of the species Saccharomyces cerevisiae (S. cerevisiae), the predominant order of these
reactions is thought to be a, b, c, d and e [Parks et al., CRC Critical Reviews in Microbiology, 6:301-341
(1978)]. According to such a predominant pathway, zymosterol is converted sequentially to fecosterol
[ergosta-8,24(28)-dienol], episterol [ergosta-7,24(28)-dienol], ergosta-5,7,24(28)-trienol,
ergosta-5,7,22,24(28)-tetraenol, and finally ergosterol.

If the enzymes catalyzing the reactions involved in the predominant pathway are substrate specific, then
one would expect to find only the six sterols set forth above in yeast. Such, however, is not the case.
Eighteen sterols have been found and described. [See, e.g., Parks et al., CRC Critical Reviews in
Microbiology, 6:301-341 (1978); Woods et al., Microbios, 10(A):73-80 (1974); Bard et al., Lipids,
12:645-654 (1977) (See Table 1)]. Thus, at least some of the enzymes are not substrate specific.

Table 1

      Sterol                                          Enzymes Required*

  1.  zymosterol (cholesta-8,24-dienol)               none
  2.  fecosterol (ergosta-8,24(28)-dienol)            a
  3.  episterol (ergosta-7,24(28)-dienol)             a, b
  4.  ergosta-5,7,24(28)-trienol                      a, b, c
  5.  ergosta-5,7,22,24(28)-tetraenol                 a, b, c, d
  6.  ergosterol (ergosta-5,7,22-trienol)             a, b, c, d, e
  7.  ergosta-7,22,24(28)-trienol                     a, b, d
  8.  cholesta-7,24-dienol                            b
  9.  cholesta-5,7,24-trienol                         b, c
 10.  cholesta-5,7,22,24-tetraenol                    b, c, d
 11.  ergosta-5,7-dienol                              a, b, c, e
 12.  ergosta-7,22-dienol                             a, b, d, e
 13.  ergosta-7-enol                                  a, b, e
 14.  ergosta-5,8-dienol                              a, c, e
 15.  ergosta-5,8,22-trienol                          a, c, d, e
 16.  ergosta-8,22-dienol                             a, d, e
 17.  ergosta-8-enol                                  a, e
 18.  ergosta-8,14,24(28)-trienol                     a

* Enzymes theoretically required for the synthesis of the designated sterol.



Despite the lack of substrate specificity, one might expect that specific alterations in the sterol
biosynthetic pathway would have predictable consequences. Currently available data show that such
predictability is not present.

For example, mutant S. cerevisiae with a defect in the expression of zymosterol-24-methyltransferase
(enzyme a), which mutants are designated erg6, might be expected to accumulate sterols 1 and 8-10 of
Table 1, which sterols theoretically do not require the action of enzyme a for their synthesis. Parks et
al., CRC Critical Reviews in Microbiology, 6:301-341 (1978), however, report that erg6 mutants accumulate
only zymosterol (#1), cholesta-5,7,24-trienol (#9) and cholesta-5,7,22,24-tetraenol (#10). Bard, M. et
al., Lipids, 12:645-654 (1977), on the other hand, report that erg6 mutants accumulate only sterols #1
and #10.

Mutant S. cerevisiae with a defect in the expression of ergosta-5,7,24(28)-trienol-22-dehydrogenase
(enzyme d), designated erg5, might be expected to accumulate sterols 1-4, 6, 8, 9, 11, 13, 14, 17 and 18.
Parks et al., CRC Critical Reviews in Microbiology, 6:301-341 (1978) report that erg5 mutants accumulate
only ergosta-5,7-dienol (#11), ergosta-5,7,24(28)-trienol (#4), ergosta-8,14,24(28)-trienol (#18) and
episterol (#3). In contrast, Bard et al., Lipids, 12:645-654 (1977) report that erg5 mutants accumulate
zymosterol (#1), ergosta-5,7-dienol (#11), ergosta-5,7,24(28)-trienol (#4), ergosta-7,24(28)-dienol (#3)
and ergosta-8,14,24(28)-trienol (#18).

Still further, mutant S. cerevisiae with a defect in episterol-5-dehydrogenase (enzyme c), designated
erg3, might be expected to accumulate sterols 1-3, 7, 8, 12, 13 and 16-18. Parks et al., CRC Critical
Reviews in Microbiology, 6:301-341 (1978) report that erg3 mutants accumulate only ergosta-7,22-dienol
(#12), ergosta-8,22-dienol (#16), ergosta-7,22,24(28)-trienol (#7), fecosterol (#2) and episterol (#3).

These data, taken together, show that specific defects in the expression of one sterol synthetic enzyme
do not lead to predictable changes in sterol accumulation. A similar degree of unpredictability is found
when sterol accumulation is examined in mutants having two defects in enzymes of the sterol biosynthetic
pathway.

Thus, for example, erg5-erg6 double mutants (defects in enzymes d and a) might be expected to accumulate
sterols 1, 8 and 9. Parks et al. and Bard et al., above, report that erg5-erg6 double mutants accumulate
only zymosterol (#1) and cholesta-5,7,24-trienol (#9).
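
The "might be expected" lists in the preceding paragraphs follow from Table 1 by a simple rule: a mutant lacking a given enzyme should, in theory, accumulate every sterol whose synthesis does not require that enzyme. The following Python sketch illustrates that reasoning only. The names REQUIRED and expected_sterols are hypothetical, the requirement sets are transcribed from Table 1, and the entries for sterols 2 and 8 are assumptions inferred from the pathway description.

```python
# Enzymes theoretically required for each sterol (numbering as in Table 1).
# Assumption: the entries for sterols 2 ("a") and 8 ("b") are inferred from
# the pathway description rather than read directly from the table.
REQUIRED = {
    1: "", 2: "a", 3: "ab", 4: "abc", 5: "abcd", 6: "abcde", 7: "abd",
    8: "b", 9: "bc", 10: "bcd", 11: "abce", 12: "abde", 13: "abe",
    14: "ace", 15: "acde", 16: "ade", 17: "ae", 18: "a",
}


def expected_sterols(defective):
    """Return the sterol numbers that need none of the defective enzymes."""
    blocked = set(defective)
    return sorted(n for n, enzymes in REQUIRED.items() if not blocked & set(enzymes))


print(expected_sterols("a"))   # erg6: [1, 8, 9, 10]
print(expected_sterols("c"))   # erg3: [1, 2, 3, 7, 8, 12, 13, 16, 17, 18]
print(expected_sterols("ad"))  # erg5-erg6: [1, 8, 9]
```

For these three cases the sterols reported by the cited investigators are only a subset of the theoretical list, which is the unpredictability discussed above.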

These data relating to sterol accumulation in yeast show that specific alterations in enzyme activity do
not result in predictable changes in sterol accumulation. The data further show a lack of agreement between
different investigators studying identical alterations. The present invention furnishes a solution to the
problem of unpredictability by providing a method and composition for increasing the accumulation of
squalene and specific sterols in yeast.


Summary of the Invention 

The present invention generally provides a method of increasing squalene and specific sterol accumulation
in mutant yeasts having a single or double defect in the expression of sterol biosynthetic pathway enzymes.
This method comprises transforming such mutant yeasts with a recombinant DNA molecule comprising a vector
operatively linked to an exogenous DNA segment that encodes a polypeptide having HMG-CoA reductase
activity and a promoter suitable for driving the expression of HMG-CoA reductase in the transformed yeast.

The structural gene encoding a polypeptide having HMG-CoA reductase activity preferably encodes an
active, truncated HMG-CoA reductase enzyme, which enzyme comprises the catalytic region and at least a
portion of the linker region that is free from the membrane binding region of HMG-CoA reductase enzyme.
The copy number of the structural gene is increased by transforming a mutant yeast with a recombinant DNA
molecule comprising a vector operatively linked to an exogenous DNA segment that encodes a polypeptide
having HMG-CoA reductase activity and a promoter that is suitable for driving the expression of the
encoded polypeptide in the transformed yeast.

Suitable promoters include promoters that are subject to inducible regulation by factors either extrinsic
or intrinsic to yeast. Preferably, both the promoter and the exogenous DNA segment are integrated into the
chromosomal DNA of the transformed yeast.

The present invention most preferably provides a method of increasing squalene, zymosterol,
cholesta-7,24-dienol and cholesta-5,7,24-trienol accumulation in yeast of the species S. cerevisiae
comprising increasing the expression level of a structural gene encoding a polypeptide having HMG-CoA
reductase activity in a mutant S. cerevisiae having defects in the expression of
zymosterol-24-methyltransferase (erg6) and ergosta-5,7,24(28)-trienol-22-dehydrogenase (erg5).

In further preferred embodiments, transformation of a mutant yeast having a defect in the expression of
the enzyme episterol-5-dehydrogenase (erg3) results in a transformed, mutant yeast which overaccumulates
squalene, ergosta-8,22-dienol, ergosta-7,22-dienol, ergosta-8-enol and ergosta-7-enol. Transformation of a
mutant yeast having a double defect in the expression of zymosterol-24-methyltransferase and
episterol-5-dehydrogenase enzymes (erg6 and erg3) results in a transformed mutant yeast which
overaccumulates squalene, zymosterol and cholesta-7,24-dienol. Transformation of a mutant yeast having a
defect in the expression of ergosta-5,7,24(28)-trienol-22-dehydrogenase (erg5) results in a transformed
mutant yeast which overaccumulates zymosterol and a mixture of ergosta-5,7,24(28)-trienol and
ergosta-5,7-dienol.

Transformation of mutant yeast is preferably accomplished using a recombinant DNA molecule selected
from the group of plasmid vectors consisting of plasmids pSOC725ARC, pSOC106ARC, pARC306E,
pARC300D, pARC300S, pARC300T and pARC304S. Most preferred is plasmid pARC304S.

The present invention further provides for a mutant species of S. cerevisiae, which mutant has a double
defect in the expression of zymosterol-24-methyltransferase and
ergosta-5,7,24(28)-trienol-22-dehydrogenase enzymes (erg5 and erg6). That mutant S. cerevisiae is
designated ATC0402mu.

The present invention still further provides for a mutant species of S. cerevisiae having a single or double
defect in the expression of enzymes that catalyze the conversion of squalene to ergosterol that is
transformed with a recombinant DNA molecule as described before.

The present invention still further provides for recombinant DNA molecules used to transform mutant yeasts
such that the transformed mutant yeast overaccumulates squalene and specific sterols. Preferred recombinant
DNA molecules are plasmids pARC304S, pARC300S, pARC300T, pARC300D, pARC306E, pSOC106ARC
and pSOC725ARC.

The present invention provides several benefits and advantages. 

One advantage of the present invention is the provision of methods known to result in the predictable 
accumulation of specific sterols. 

Another advantage of the present invention is the ability to accumulate specific sterols to levels markedly
greater than levels found in non-transformed yeast.

Still further benefits and advantages will be apparent to the skilled worker from the description that follows. 

Brief Descriptions of the Drawings 

Figure 1 is a schematic diagram illustrating the various transformation steps involved in the metabolic
conversion of zymosterol to ergosterol as shown and discussed in Bard et al., Lipids, 12(8):645 (1977).
The letters (a-e) indicate the five enzymes responsible for catalyzing the individual transformation steps.
Numerals alone or with the letter "C" and an enzymic name indicate the position of the enzymes' activities
and the activity of each enzyme.

Figure 2, shown as twelve panels designated Figure 2-1 through 2-12, is the nucleotide base sequence
(SEQ ID NO:1) and derived amino acid residue sequence (SEQ ID NO:2) for S. cerevisiae HMG-CoA reductase
1 published by Basson et al., Mol. Cell Biol., 8(9):3797-3808 (1988). Nucleotides are numbered (left-hand
side) in the 5' to 3' direction. Position 1 corresponds to the first nucleotide of the ATG triplet coding
for the initiator methionine. The predicted amino acid sequence is shown below the nucleotide sequence.
The amino acid residues are numbered (right-hand side) beginning with the initiator methionine.

Figure 3 is a schematic diagram showing the physical structure and genetic organization of plasmid
pSOC725ARC. Plasmid pSOC725ARC was constructed to place a coding sequence for a truncated HMG-CoA
reductase gene under control of a GAL 1-10 promoter. This plasmid also contains the TRP-1 gene and the
yeast 2 micron origin of replication. Certain restriction sites are indicated by lines linked to the arcs,
together with abbreviations for their respective restriction endonuclease enzymes.

Figure 4 is a schematic diagram showing the physical structure and genetic organization of plasmid
pSOC106ARC. Plasmid pSOC106ARC was constructed to place a coding sequence for an intact HMG-CoA
reductase gene under the control of a GAL 1-10 promoter. Plasmid pSOC106ARC also contains the TRP-1
gene and the yeast 2 micron origin of replication. Certain restriction sites are indicated as in Figure 3.

Figure 5 is a schematic diagram showing the physical structure and genetic organization of plasmid
pARC306E. Plasmid pARC306E was constructed to place a coding sequence for a truncated HMG-CoA
reductase gene under control of a GAL-1 promoter. Plasmid pARC306E also contains the TRP-1 gene. Certain
restriction sites are indicated as in Figure 3.

Figure 6 is a schematic diagram showing the physical structure and genetic organization of plasmid
pARC300D. Plasmid pARC300D was constructed to place a coding sequence for a truncated HMG-CoA
reductase gene under the control of a PGK promoter. Plasmid pARC300D also contains the TRP-1 gene.
Certain restriction sites are indicated as in Figure 3.

Figure 7 is a schematic diagram showing the physical structure and genetic organization of plasmid
pARC300S. Plasmid pARC300S was constructed to place a coding sequence for a truncated HMG-CoA
reductase gene under control of a PGK promoter. Plasmid pARC300S also contains a URA3 selectable marker.
Certain restriction sites are indicated as in Figure 3.

Figure 8 is a schematic diagram showing the physical structure and genetic organization of plasmid
pARC300T. Plasmid pARC300T was constructed to place a coding sequence for a truncated HMG-CoA
reductase gene under control of a PGK promoter. Plasmid pARC300T also contains a URA3 selectable marker.
Certain restriction sites are indicated as in Figure 3.

Figure 9 is a schematic diagram showing the physical structure and genetic organization of plasmid
pARC304S. Plasmid pARC304S was constructed to place a coding sequence of a truncated HMG-CoA
reductase gene under the control of an ADH promoter. Plasmid pARC304S also contains a URA3 selectable
marker. Certain restriction sites are indicated as in Figure 3.

Detailed Description of the Invention

I. Definitions

The following words and phrases have the meanings set forth below.

Expression: The combination of intracellular processes, including transcription and translation, undergone
by a structural gene to produce a polypeptide.

Expression vector: A DNA sequence that forms control elements that regulate expression of structural
genes when operatively linked to those genes.

Operatively linked: A structural gene is covalently bonded in correct reading frame to another DNA (or
RNA as appropriate) segment, such as to an expression vector, so that the structural gene is under the
control of the expression vector.

Promoter: A recognition site on a DNA sequence or group of DNA sequences that provides an expression
control element for a structural gene and to which RNA polymerase specifically binds and initiates RNA
synthesis (transcription) of that gene.

Recombinant DNA molecule: A hybrid DNA sequence comprising at least two nucleotide sequences not
normally found together in nature.

Structural gene: A DNA sequence that is expressed as a polypeptide, i.e., an amino acid residue sequence.

Vector: A DNA molecule capable of replication in a cell and/or to which another DNA segment can be
operatively linked so as to bring about replication of the attached segment. Alternatively, a vector can be
a non-replicating vector that is integrated into the chromosome of the transformed cell. A plasmid is an
exemplary vector.

II. The Invention 

The present invention relates to compositions and methods for increasing the accumulation of squalene
and specific sterols in yeast cultures as well as to the yeast that exhibit increased squalene and sterol
accumulation relative to a non-transformed yeast. Preferred yeasts are yeasts of the Saccharomyces or
Candida genus. A more preferred yeast is Saccharomyces cerevisiae (S. cerevisiae).

A yeast contemplated by this invention is transformed with an added structural gene that encodes a
polypeptide having HMG-CoA reductase activity, that encoded polypeptide being expressed in the transformed
yeast. Preferred non-transformed yeasts are mutant species having a single or double defect in the
expression of enzymes involved in converting zymosterol to ergosterol (sterol biosynthetic pathway
enzymes). The non-transformed and transformed yeasts compared are of the same species, such as
S. cerevisiae.

Sterol production in a yeast culture of the present invention is increased by increasing the cellular
activity of the enzyme HMG-CoA reductase, which enzyme catalyzes the conversion of
3-hydroxy-3-methylglutaryl Coenzyme A (HMG-CoA) to mevalonate. As used herein, "cellular activity" means
the total catalytic activity of HMG-CoA reductase in a yeast cell.

Cellular HMG-CoA reductase activity is increased by increasing the expression level of a structural gene
encoding a polypeptide having HMG-CoA reductase catalytic activity. Expression of that encoded structural
gene enhances the cellular activity of that enzyme. The expression level is increased by methods well known
in the art. For example, expression of a structural gene is increased by deregulating the promoter, which
controls expression of such a structural gene. The promoter that regulates expression of the HMG-CoA
reductase gene in a normal, wild-type yeast can be identified and excised from the genome. A new promoter,
which allows for overexpression of the HMG-CoA reductase gene, is then inserted according to standard
transformation techniques. A preferred means of increasing the expression level of a structural gene
encoding a polypeptide having HMG-CoA reductase catalytic activity is to increase the copy number of a
structural gene encoding such a polypeptide.

The copy number is increased by transforming a yeast cell with a recombinant DNA molecule comprising
a vector operatively linked to an exogenous DNA segment that encodes a polypeptide having HMG-CoA
reductase activity, and a promoter suitable for driving the expression of said polypeptide in said yeast.
Such a polypeptide is catalytically active, and is preferably a truncated HMG-CoA reductase protein.

Thus, a transformed yeast cell has one or more added genes that encode a polypeptide having HMG-CoA
reductase activity relative to a non-transformed yeast of the same species. As such, a transformed yeast
can be distinguished from a non-transformed yeast by standard technology such as agarose separation of DNA
fragments or mRNAs followed by transfer and appropriate blotting with DNA or RNA or by use of polymerase
chain reaction technology, as are well known. Relative HMG-CoA reductase activity of the transformed and
non-transformed yeasts can also be compared, with a relative increase in HMG-CoA reductase activity in
transformed yeasts being indicative of transformation.

The accumulation of squalene and specific sterols can also be used to distinguish between
non-transformed and transformed yeasts.

A. Structural Genes 

The present invention contemplates transforming a yeast with a structural gene that encodes a polypeptide
having HMG-CoA reductase activity. The HMG-CoA reductase enzymes of both animal and yeast cells
comprise three distinct amino acid residue sequence regions, which regions are designated the catalytic
region, the membrane binding region and the linker region.

The catalytic region contains the active site of the HMG-CoA reductase enzyme and comprises about forty
percent of the total, localized on the COOH-terminal portion of intact HMG-CoA reductase enzyme. The
membrane binding region contains hydrophobic amino acid residues and comprises about fifty percent of the
total, localized on the NH2-terminal portion of intact HMG-CoA reductase enzyme. The linker region
connects the catalytic and membrane binding regions, and constitutes the remaining about ten percent of the
intact enzyme. As discussed in greater detail below, only the catalytic region of HMG-CoA reductase is
needed herein.

Thus, a structural gene that encodes a polypeptide corresponding to that catalytic region is the minimal
gene required for transforming yeasts. However, larger polypeptide enzymes and their structural genes are
preferred. Thus, the present invention contemplates use of truncated structural genes that encode the
active catalytic region, or the catalytic region plus at least a portion of the linker region that is free
from the membrane binding region of HMG-CoA reductase.

A structural gene encoding a polypeptide having HMG-CoA reductase activity can be obtained or
constructed from a variety of sources and by a variety of methodologies. [See, e.g., Carlson et al., Cell,
28:145 (1982); Rine et al., Proc. Nat. Acad. Sci. U.S.A., 80:6750 (1983)]. Exemplary of such structural
genes are the mammalian and yeast genes encoding HMG-CoA reductase.

The mammalian genome contains a single gene encoding HMG-CoA reductase. The nucleotide base
sequences of the hamster and human genes for HMG-CoA reductase have been described. A composite
nucleotide sequence of cDNA corresponding to the mRNA, as well as the derived amino acid residue sequence,
for hamster HMG-CoA reductase is found in Chin et al., Nature, 308:613 (1984) and SEQ ID NO:3. The
composite nucleotide sequence in that paper, comprising about 4606 base pairs, includes the nucleotide
sequence encoding the intact hamster HMG-CoA reductase enzyme.

Intact hamster HMG-CoA reductase comprises about 887 amino acid residues, shown in SEQ ID NO:4.

A preferred structural gene is one that encodes a polypeptide corresponding to only the catalytic region of
the enzyme. Two catalytically active segments of hamster HMG-CoA reductase have been defined [Liscum
et al., J. Biol. Chem., 260(1):522 (1985)]. One catalytic region has an apparent size of about 63 kDa and
comprises amino acid residues from about position 373 to about position 887 of SEQ ID NO:4. A second
catalytic region has an apparent size of about 53 kDa and comprises amino acid residues from about position
460 to about position 887 of SEQ ID NO:4. The about 63 kDa catalytically active segment is encoded by base
pairs from about nucleotide position 1282 to about nucleotide position 2824 of the sequence in SEQ ID NO:3.
The about 53 kDa catalytically active segment is encoded by base pairs from about nucleotide position 1543
to about nucleotide position 2824 of the sequence in SEQ ID NO:3.

In a preferred embodiment, the utilized structural gene encodes the catalytic region and at least a portion
of the linker region of HMG-CoA reductase. The linker region of hamster HMG-CoA reductase comprises
amino acid residues from about position 340 to about position 373 or from about position 340 to about
position 460, depending upon how the catalytic region is defined. These linker regions are encoded by base
pairs from about nucleotide position 1183 to about nucleotide position 1282 or from about position 1183 to
about position 1543, respectively, of the sequence in SEQ ID NO:3. The structural gene encoding the linker
region is operatively linked to the structural gene encoding the catalytic region.

In one particularly preferred embodiment, a structural gene encoding a catalytically active, truncated
HMG-CoA reductase enzyme can optionally contain base pairs encoding a small portion of the membrane
region of the enzyme. A truncated hamster HMG-CoA reductase gene, designated HMGR-Δ227, comprising
nucleotides 164-190 and 1187-2824 of the sequence in SEQ ID NO:3, which encodes amino acid residues 1-9
(from the membrane binding region) and 342-887, has been used to transform cells lacking HMG-CoA
reductase [Gil et al., Cell, 41:249 (1985)].
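
The correspondence between amino acid residue positions and nucleotide positions quoted above for the hamster sequence can be checked with simple codon arithmetic. The Python sketch below is illustrative only. It assumes that the codon for residue 1 begins at nucleotide 164 of SEQ ID NO:3, which is inferred from the statement that nucleotides 164-190 encode residues 1-9; the function names are hypothetical.

```python
# Assumption: in SEQ ID NO:3 the codon for residue 1 begins at nucleotide 164,
# as implied by nucleotides 164-190 encoding amino acid residues 1-9.
CODING_START = 164


def codon_span(residue, coding_start=CODING_START):
    """Return (first, last) nucleotide positions of the codon for a 1-based residue."""
    first = coding_start + 3 * (residue - 1)
    return first, first + 2


def segment_span(first_residue, last_residue, coding_start=CODING_START):
    """Return (first, last) nucleotide positions encoding an inclusive residue range."""
    start = codon_span(first_residue, coding_start)[0]
    end = codon_span(last_residue, coding_start)[1]
    return start, end


print(segment_span(1, 9))      # (164, 190)
print(segment_span(342, 887))  # (1187, 2824)
print(segment_span(373, 887))  # (1280, 2824)
print(segment_span(460, 887))  # (1541, 2824)
```

Under that assumption the two segments of HMGR-Δ227 are reproduced exactly, and the computed starts of the catalytic segments fall within two nucleotides of the "about 1282" and "about 1543" positions given in the text.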

A structural gene encoding a polypeptide comprising a catalytically active, truncated or intact HMG-CoA
reductase enzyme from other organisms such as yeast can also be used in accordance with the present
invention.

Yeast cells contain two genes encoding HMG-CoA reductase. The two yeast genes, designated HMG1 and
HMG2, encode two distinct forms of HMG-CoA reductase, designated HMG-CoA reductase 1 and HMG-CoA
reductase 2. The nucleotide base sequence of HMG1 (SEQ ID NO:1) as well as the amino acid residue
sequence of HMG-CoA reductase 1 (SEQ ID NO:2) are presented in Figure 2, reprinted from Basson et al.,
Mol. Cell Biol., 8(9):3797 (1988).

The entire HMG1 gene comprises about 3360 base pairs. Intact HMG-CoA reductase 1 comprises an amino
acid sequence of about 1054 amino acid residues.

The entire HMG2 gene comprises about 3348 base pairs shown in SEQ ID NO:5. Intact HMG-CoA
reductase 2 comprises about 1045 amino acid residues shown in SEQ ID NO:6 (Basson et al., above).

By analogy to the truncated hamster structural gene, structural genes encoding polypeptides comprising
catalytically active, truncated HMG-CoA reductase enzymes from yeast can also be used in accordance with
the present invention.

The catalytic region of HMG-CoA reductase 1 comprises amino acid residues from about residue 618 to
about residue 1054, i.e., the COOH-terminus. A structural gene that encodes the catalytic region comprises
base pairs from about nucleotide position 1974 to about position 3282 of Figure 2 and SEQ ID NO:1.

The linker region of HMG-CoA reductase 1 comprises an amino acid sequence from about residue 525 to
about residue 617. A structural gene that encodes the linker region comprises nucleotides from about
position 1695 to about position 1974 of Figure 2. A structural gene encoding a polypeptide comprising the
catalytic region and at least a portion of the linker region of yeast HMG-CoA reductase 1 preferably
comprises the structural gene encoding the linker region of the enzyme operatively linked to the structural
gene encoding the catalytic region of the enzyme.
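
The residue ranges given above for yeast HMG-CoA reductase 1 can be compared with the approximate proportions stated earlier for the three regions (about forty percent catalytic, about fifty percent membrane binding, about ten percent linker). The Python sketch below is illustrative only. The membrane binding range of residues 1 to 524 is an assumption, taken as everything on the NH2-terminal side of the linker region, and the names used are hypothetical.

```python
TOTAL_RESIDUES = 1054  # residues in intact HMG-CoA reductase 1

# Inclusive residue ranges. The membrane binding range is an assumption:
# it is taken as all residues N-terminal to the linker region.
REGIONS = {
    "membrane binding": (1, 524),
    "linker": (525, 617),
    "catalytic": (618, 1054),
}


def region_percentages(regions, total):
    """Return each region's share of the intact enzyme, in percent."""
    return {
        name: 100 * (last - first + 1) / total
        for name, (first, last) in regions.items()
    }


for name, pct in region_percentages(REGIONS, TOTAL_RESIDUES).items():
    print(f"{name}: {pct:.1f}%")
# membrane binding: 49.7%
# linker: 8.8%
# catalytic: 41.5%
```

Under that assumption the yeast enzyme agrees with the approximate fifty, ten and forty percent proportions described for HMG-CoA reductase enzymes generally.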

Also by analogy to the truncated hamster gene, a truncated HMG1 gene can optionally contain nucleotide
base pair sequences encoding a small portion of the membrane binding region of the enzyme. Such a
structural gene preferably comprises base pairs from about nucleotide position 121 to about position 146
and from about position 1695 to about position 3282 of Figure 2 and SEQ ID NO:1.

A construct similar to those above from an analogous portion of yeast HMG-CoA reductase 2 can also be 
utilized. 

It will be apparent to those of skill in the art that the nucleic acid sequences set forth herein, either
explicitly, as in the case of the sequences set forth above, or implicitly with respect to nucleic acid
sequences generally known and not presented herein, can be modified due to the built-in redundancy of the
genetic code and non-critical areas of the polypeptide that are subject to modification and alteration. In
this regard, the present invention contemplates allelic variants of structural genes encoding a polypeptide
having HMG-CoA reductase activity.

The previously described DNA segments are noted as having a minimal length, as well as total overall
lengths. That minimal length defines the length of a DNA segment having a sequence that encodes a particular
polypeptide having HMG-CoA reductase activity. As is well known in the art, so long as the required DNA
sequence is present and in proper reading frame (including start and stop signals), additional base pairs
can be present at either end of the segment and that segment can still be utilized to express the protein.
This, of course, presumes the absence in the segment of an operatively linked DNA sequence that represses
expression, expresses a further product that consumes the enzyme desired to be expressed, expresses a
product other than the desired enzyme or otherwise interferes with the structural gene of the DNA segment.

Thus, so long as the DNA segment is free of such interfering DNA sequences, the maximum size of a
recombinant DNA molecule, particularly an expression vector, is governed mostly by convenience and the
vector size that can be accommodated by a host cell, once all of the minimal DNA sequences required for
replication and expression, when desired, are present. Typically, a DNA segment of the invention can be up
to 15,000 base pairs in length. Minimal vector sizes are well known.

B. Recombinant DNA Molecules

A recombinant DNA molecule of the present invention can be produced by operatively linking a vector to
a useful DNA segment to form a plasmid such as discussed herein. Particularly preferred recombinant DNA
molecules are discussed in detail in Examples 2 to 7, hereafter. A vector capable of directing the
expression of a polypeptide having HMG-CoA reductase activity is referred to herein as an "expression
vector".

Such expression vectors contain expression control elements including the promoter. The polypeptide
coding genes are operatively linked to the expression vector to permit the promoter sequence to direct RNA
polymerase binding and expression of the desired polypeptide coding gene. Useful in expressing the
polypeptide coding gene are promoters that are inducible, viral, synthetic, constitutive as described by
Paszkowski et al., EMBO J., 3:2719 (1989) and Odell et al., Nature, 313:810 (1985), and temporally regulated,
spatially regulated, and spatiotemporally regulated as disclosed in Chau et al., Science, 244:174-181 (1989).
The promoter preferably comprises a promoter sequence whose function in regulating expression of the
structural gene is substantially unaffected by the amount of sterol in the cell. As used herein, the term
"substantially unaffected" means that the promoter is not responsive to direct feedback control by the sterols
accumulated in transformed cells.

A promoter is also selected for its ability to direct the transformed yeast's transcriptional activity to the
structural gene encoding a polypeptide having HMG-CoA reductase activity. Structural genes can be driven by a
variety of promoters in yeast.

Promoters utilized with the present invention are preferably those regulated by factors that can be
monitored and controlled in the internal or external environment of the transformed cell. Examples of promoters
inducibly regulated by factors in the cell's external environment (extrinsic factors) are the GAL 1 promoter, the
GAL 10 promoter, the GAL 1-10 promoter, the GAL 7 promoter, the metallothionine promoter, the α-factor
promoter, the invertase promoter and the enolase promoter. Preferred are the well known GAL 1, the GAL 10 and
the GAL 1-10 promoters.

Examples of promoters subject to inducible regulation by factors in the cell's internal environment (intrinsic
factors) are the phosphoglycerate kinase (PGK) promoter, the triose-phosphate isomerase (TPI) promoter, the
alcohol dehydrogenase (ADH) promoter and the repressible acid phosphatase promoter. Preferred are the well
known PGK and the ADH promoters.

The choice of which expression vector and ultimately to which promoter a polypeptide coding gene is
operatively linked depends directly on the functional properties desired, e.g. the location and timing of protein
expression, and the host cell to be transformed. These are well known limitations inherent in the art of
constructing recombinant DNA molecules. However, a vector useful in practicing the present invention is capable
of directing the expression of the polypeptide coding gene included in the DNA segment to which it is
operatively linked.
The present method contemplates a plasmid vector. The plasmid vectors of the present invention can be
incorporated either within (integrated) or without (episomal) the chromosomes of the transformed cell. An
episomal plasmid includes an origin of replication for yeast, the nucleic acid sequence that encodes a
polypeptide having HMG-CoA reductase activity, a promoter, and a selective marker. The selective marker can
include genes conveying antibiotic resistance, or permitting an auxotrophic host to metabolize a substrate that
it would not otherwise be able to metabolize, but for the presence of the plasmid vector. However, the use of
antibiotic resistance as a selective marker requires growing organisms in an antibiotic culture medium. Due to
the expense of the antibiotic, organisms dependent on antibiotics are difficult to develop commercially.
Generally, auxotrophic organisms are used for yeast.

Auxotrophic organisms can be produced by mutation and culture techniques which are well known in the
art. Selective markers which can complement an auxotrophic host organism include the well known TRP 1 gene
encoding phosphoribosyl anthranilate isomerase, the URA 3 gene encoding orotidine-5'-phosphate
decarboxylase, the LEU 2 gene encoding isopropylmalate isomerase, and the HIS 3 gene encoding histidinol
dehydrogenase. A preferred selective marker for an auxotrophic host is TRP 1. Preferred episomal plasmid
vectors are pSOC725ARC and pSOC106ARC.

Episomally replicating vectors are sometimes difficult to maintain in host organisms for long periods of time
in liquid culture, especially when the selective pressure used to maintain the vector is complementation of a
nutritional auxotrophy. A preferred embodiment of the present invention includes an integrating vector which
requires little or no selective pressure to maintain base sequences for the polypeptide having HMG-CoA
reductase activity and the promoter.

Integrating vectors, in accordance with the present invention, include base sequences that encode a
polypeptide having HMG-CoA reductase activity, a promoter, a selective marker and sequences homologous
to host chromosomal DNA that permit the base sequences to be incorporated within the chromosome via
homologous recombination. The homologous region includes restriction sites that permit the plasmid to become
linear. In linear form, the plasmid can recombine at homologous regions of the chromosome. Integrating vectors
do not include origins of replication for the host organism.

Preferred integrating vectors are pARC300S, pARC300T, pARC300D, pARC306E and pARC304S. Plasmid
vector pARC304S is most preferred as evidenced by its ability to generate the greatest enhancement in
sterol accumulation (see Example 15). The basic genetic characteristics of preferred plasmid vectors are
summarized in Table 2, below.

TABLE 2

Plasmid vector      Genetic characteristics

pSOC106             TRP1-2µ ori-GAL 1-HMG1*
pSOC725             TRP1-2µ ori-GAL 10-tHMG1**
pARC306E            TRP1-GAL 1-tHMG1
pARC300D            TRP1-PGK-tHMG1
pARC300S,T          URA3-PGK-tHMG1-ura3 term
pARC304S            URA3-ADH-tHMG1-ura3 term

* HMG1 - gene encoding intact S. cerevisiae HMG-CoA reductase 1.

** tHMG1 - gene encoding catalytic region and a portion of the linker region of S. cerevisiae HMG-CoA
reductase 1.
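
The distinction between episomal and integrating vectors drawn above can be read directly off the genetic characteristics in Table 2. The following is a minimal sketch, assuming the table entries are transcribed as lists of elements (the dictionary name, the function names and the reading of the third row as pARC306E are assumptions made for illustration): a vector carrying the yeast 2µ origin is treated as episomal, and one without a yeast origin as integrating.

```python
# Minimal sketch: classify the Table 2 vectors by the presence of a yeast origin of replication.
# TABLE_2 is a hand transcription of the table; names and layout are illustrative assumptions.

TABLE_2 = {
    "pSOC106": ["TRP1", "2µ ori", "GAL 1", "HMG1"],
    "pSOC725": ["TRP1", "2µ ori", "GAL 10", "tHMG1"],
    "pARC306E": ["TRP1", "GAL 1", "tHMG1"],
    "pARC300D": ["TRP1", "PGK", "tHMG1"],
    "pARC300S,T": ["URA3", "PGK", "tHMG1", "ura3 term"],
    "pARC304S": ["URA3", "ADH", "tHMG1", "ura3 term"],
}


def vector_type(elements):
    """Episomal vectors carry a yeast origin; integrating vectors do not."""
    return "episomal" if "2µ ori" in elements else "integrating"


def selective_marker(elements):
    """By the convention of Table 2, the first listed element is the selective marker."""
    return elements[0]


for name, elements in TABLE_2.items():
    print(f"{name:12s} {vector_type(elements):12s} marker={selective_marker(elements)}")
```

Under these assumptions the two pSOC vectors come out as episomal and the four pARC entries as integrating, which matches the preferred vectors named in the text.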

Individuals skilled in the art will readily recognize that episomal and integrating vectors are often amplified
in organisms other than the intended host and require means of replication and selection in the non-host
organism. Generally, the non-host organism is Escherichia coli due to its well-known features and
characteristics.

In preferred embodiments, the vector used to express the polypeptide coding gene includes a selection
marker that is effective in a yeast cell, such as the URA 3 or TRP 1 markers. Other suitable selection means
for use in amplifying the vectors in bacteria include antibiotic markers, such as genes encoding
beta-lactamase (penicillin resistance), chloramphenicol transacetylase (chloramphenicol resistance), and
neomycin phosphotransferase (kanamycin and neomycin resistance).

A variety of methods has been developed to operatively link DNA to vectors via complementary cohesive
termini or blunt ends. For instance, complementary homopolymer tracts can be added to the DNA segment to
be inserted and to the vector DNA. The vector and DNA segment are then joined by hydrogen bonding between
the complementary homopolymeric tails to form recombinant DNA molecules.

Alternatively, synthetic linkers containing one or more restriction endonuclease sites can be used to join
the DNA segment to the expression vector. The synthetic linkers are attached to blunt-ended DNA segments
by incubating the blunt-ended DNA segments with a large excess of synthetic linker molecules in the presence
of an enzyme that is able to catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4
DNA ligase. Thus, the products of the reaction are DNA segments carrying synthetic linker sequences at their
ends. These DNA segments are then cleaved with the appropriate restriction endonuclease and ligated into
an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the
synthetic linker. Synthetic linkers containing a variety of restriction endonuclease sites are commercially
available from a number of sources including New England BioLabs, Beverly, MA.

Also contemplated by the present invention are RNA equivalents of the above described recombinant DNA 
molecules. 



C. Transformed Yeasts and Methods of Transformation 

The copy number of a gene coding for a polypeptide having HMG-CoA reductase activity is increased by
transforming a desired yeast with a suitable vector that contains that structural gene. Expression of that gene
in the transformed yeast enhances the activity of HMG-CoA reductase.

Yeast cells are transformed in accordance with the present invention by methods known and readily
apparent to those of skill in the yeast transformation art [See, e.g., Hinnen et al., Proc. Natl. Acad. Sci. USA,
75:1929 (1978); Ito et al., J. Bacteriol., 153:163-168 (1983)].

A preferred general method of transformation is the lithium acetate procedure of Ito et al., above. Yeast
cells are grown to a concentration of about 2 x 10^7 cells/ml in a medium containing yeast extract,
bactopeptone and dextrose. Cells are collected by low speed centrifugation and resuspended in a transformation
medium containing lithium acetate in a Tris-EDTA buffer.

Cells are maintained in the transformation medium for about one hour at about 30°C. Recombinant DNA
molecules of desired composition are added to the transformation medium cell suspension and the mixture is
maintained at about 30°C for about one-half hour. Polyethylene glycol (M.W. 4000) is then added to the cell
suspension such that the final concentration of polyethylene glycol is about 35 percent weight/volume (w/v).
Cells are maintained in the polyethylene glycol-containing solution at about 30°C for about two hours and then
at about 42°C for an additional five minutes. Sterile distilled water is added to the cell suspension, and the
cells are collected by low speed centrifugation. Further specifics are provided hereinafter.

Successfully transformed cells are identified by growing the transformed cells on selection medium,
identifying cell characteristics indicative of transformation (i.e., increased accumulation of squalene or
specific sterols), and analyzing nucleic acids isolated from such transformed cells with standard techniques
such as Southern blot analysis [Holm et al., Gene, 42:169 (1986)].

D. Mutated Yeasts

The yeasts utilized in accordance with the present invention are mutated yeasts having single or double
defects in the expression of enzymes that catalyze the conversion of zymosterol to ergosterol. Such enzymes
are referred to herein as "erg" gene products. Table 3 below lists the particular erg designations for specific
enzyme expression defects.

Table 3

Enzyme Expression Defect                         Mutant Designation

zymosterol-24-methyltransferase                  erg6
ergosta-5,7,24(28)-trienol-22-dehydrogenase      erg5
episterol-5-dehydrogenase                        erg3

Mutants used in accordance with the present invention can be purchased or generated from commercially
available sources such as the Yeast Genetic Stock Center (Berkeley, CA). For example, erg5 and erg5-erg6
double mutants are produced from commercially available sources.

Mutant yeast ATC0402mu, an erg5-erg6 double mutant, is constructed by crossing a commercially
available erg6 mutant yeast, M610-12B, with a commercially available erg5 mutant, pol5aA22, and then crossing
the resultant double mutant, ATC0403mu, with a wild-type yeast. Mutant yeast ATC0402mu and its derivative
mutant yeast ATC0315rc are the most preferred mutants for transformation with the plasmid vectors of the
present invention.

Alternatively, ATC0403 is crossed with a different wild-type, and mutants having desired genotypes are
back-crossed twice with wild-type yeast to yield species ATC4124, an erg5 mutant.

Mutants are also obtained by well known methods of inducing mutations. See, e.g., Boeke et al., Mol. Gen.
Genet., 197:345-346 (1984); Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory, N.Y.
(1986).

In a preferred embodiment, wild-type yeasts are transformed with an inducible "TY1-neo" transposon as
a mutagenic agent. Plasmid pJEF1105, containing a GALTY1-neo expression cassette, is used as the
transforming agent [Boeke et al., Science, 239:280-282 (1989)]. Competent transformants demonstrating both
neomycin and nystatin resistance are then evaluated for sterol content.

Transformation of wild-type yeast with pJEF1105 yields mutant ATC6118, an erg3 mutant, and mutant
ATC0501, an erg6 mutant.

Mutants having single expression defects are then crossed to generate mutants having double defects in
enzyme expression. For example, the crossing of mutant ATC6118 with mutant ATC0501 yields mutant
ATC6119, an erg3-erg6 double mutant.
The genotypes of exemplary mutants contemplated for use in the present invention are presented in Table
4 below. Genotype symbols are used in accordance with convention cited in Mortimer et al., Yeast, 5:321-403
(1989) and Broach, The Molecular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance, Strathern,
Jones and Broach, eds., Cold Spring Harbor Laboratory, pp. 653-727 (1981).

Table 4

Strain          Genotype

pol5aA22        a, erg5
M610-12B        a, ile3, erg6-5, trp1, gal2
DBY745          a, ade1, ura3-52, leu2-100, leu2-122, MEL, gal 1 gal 10
YNN281          a, trp1-Δ, his3Δ-200, ura3-52, lys2
ATC0403mu       a, trp1, gal, erg5, erg6
ATC0402mu       a, trp1, GAL, erg5, erg6
ATC6118         a, his3Δ-200, erg3, ura3-52, GAL
ATC4124         a, erg5, trp1, GAL
ATC4154         a, ura3-52, erg7, gal
ATC6119         a, erg3, erg6, ura3-52, GAL
ATC1500cp       a, erg5, erg6
ATC0315rc       a, ura3, erg5, erg6
ATC1551         a, erg5, erg6

E. Squalene and Sterol Accumulation in Transformed Yeast

The transformed mutant yeast species of the present invention overaccumulate squalene and specific
sterols relative to non-transformed mutants of the same species. Relative to a non-transformed erg3 mutant,
an erg3 mutant transformed with a plasmid vector used herein overaccumulates squalene, ergosta-8,22-dienol,
ergosta-7,22-dienol, ergosta-6-enol and ergosta-7-enol.

Relative to a non-transformed erg5 mutant, an erg5 mutant transformed with a plasmid vector used herein
overaccumulates squalene, zymosterol, and a mixture of ergosta-5,7,24(28)-trienol and ergosta-5,7-dienol.

Similar results are seen when mutants having double defects in enzymes of the sterol synthetic pathway
are transformed. Relative to a non-transformed erg3-erg6 mutant, an erg3-erg6 mutant transformed with a
useful plasmid vector overaccumulates squalene, zymosterol and cholesta-7,24-dienol.

Relative to a non-transformed erg5-erg6 mutant, an erg5-erg6 double mutant transformed with the plasmid
vector useful herein overaccumulates squalene, zymosterol, cholesta-5,7,24-trienol and cholesta-7,24-dienol.

F. HMG-CoA Reductase Activity in Transformed Yeasts

The expression of a structural gene encoding a polypeptide having HMG-CoA reductase activity in the
transformed yeast of the present invention enhances the cellular activity of said HMG-CoA reductase. As a
result of transformation, the copy number of an added gene encoding a polypeptide having HMG-CoA reductase
activity is increased from 1 to about 2 to about 10.

Cellular activity of HMG-CoA reductase in such transformed cells is almost linearly proportional to the
increase in copy number through a copy number of about 6 and then falls slightly when a copy number of 9 is
reached. Thus, when the copy number is increased to about 2, HMG-CoA reductase activity is elevated to a
level about 1.4 times the activity observed in non-transformed yeast. A further increase in the copy number to
a level of about 6 is accompanied by a further increase in HMG-CoA reductase activity to a level about 2.6
times that found in non-transformed yeast. Increases in the copy number beyond about 6 to about 9 are not
accompanied by further increases in HMG-CoA reductase activity. A transformed yeast having a copy number
of about 9 has a level of HMG-CoA reductase activity approximately twice that seen in non-transformed
yeast.
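
The relationship between copy number and activity described above can be summarized with a small interpolation sketch. It assumes that the non-transformed yeast (copy number 1) has a relative activity of 1.0 and joins the stated values (about 1.4 at 2 copies, about 2.6 at 6 copies, about 2.0 at 9 copies) with straight lines; the piecewise-linear form and the function name are illustrative assumptions, not measured data.

```python
# Minimal sketch: piecewise-linear interpolation between the approximate activity
# levels stated in the text. Intermediate values are illustrative only.
from bisect import bisect_right

# (copy number, HMG-CoA reductase activity relative to non-transformed yeast)
POINTS = [(1, 1.0), (2, 1.4), (6, 2.6), (9, 2.0)]


def relative_activity(copy_number):
    """Interpolate relative activity for a copy number between 1 and 9."""
    xs = [x for x, _ in POINTS]
    if not xs[0] <= copy_number <= xs[-1]:
        raise ValueError("copy number outside the range described in the text")
    # Index of the right-hand point of the segment containing copy_number.
    i = min(bisect_right(xs, copy_number), len(xs) - 1)
    (x0, y0), (x1, y1) = POINTS[i - 1], POINTS[i]
    return y0 + (y1 - y0) * (copy_number - x0) / (x1 - x0)


for n in (1, 2, 4, 6, 9):
    print(n, round(relative_activity(n), 2))
```

The sketch makes the stated trend explicit: activity rises roughly linearly up to about 6 copies and then declines toward about twice the non-transformed level at 9 copies.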

G. Harvesting of Sterols


If desired, transformed yeasts are harvested to recover the sterol product. Most of the sterol in our
genetically transformed yeast of this invention occurs in the form of fatty acid esters. To obtain free sterols, it
is therefore necessary to saponify the "yeast pulp" in base, e.g., as described in the Examples below (2:1
EtOH/H2O containing 20 percent w/v KOH).

In a preferred embodiment, harvesting comprises:

(i) homogenizing sterol-containing transformed yeasts to produce a pulp;

(ii) extracting the sterol(s) from the pulp with an appropriate basic solvent such as an organic solvent or
by supercritical extraction followed by base saponification in an appropriate solvent [Favati et al., J. Food
Sci., 53:1532 (1988) and the citations therein] to produce a sterol-containing liquid solution or suspension;
and

(iii) isolating the sterol(s) from the solution or suspension.

Transformed yeasts are homogenized to produce a pulp using methods well known to one skilled in the
art. This homogenization can be done manually, by a machine, or by a chemical means. The pulp consists of
a mixture of the sterol of interest, residual amounts of precursors, cellular particles and cytosol contents, which
is subjected to extraction procedures.

Sterol(s) can be extracted from the pulp produced above to form a sterol-containing solution or suspension.
Such extraction processes are common and well known to one skilled in this art. For example, the extracting
step can consist of soaking or immersing the pulp in a suitable solvent. This suitable solvent is capable of
dissolving or suspending the sterol present in the pulp to produce a sterol-containing solution or suspension.
Solvents useful for such an extraction process are well known to those skilled in the art and include several
organic solvents and combinations thereof such as methanol, ethanol, isopropanol, acetone, acetonitrile,
tetrahydrofuran (THF), hexane, and chloroform as well as water-organic solvent mixtures. A vegetable oil such
as peanut, corn, soybean and similar oils can also be used for this extraction.

Yeasts transformed with a structural gene for an active, truncated HMG-CoA reductase enzyme are grown
under suitable culture conditions for a period of time sufficient for sterols to be synthesized. The
sterol-containing yeast cells are then lysed chemically or mechanically, and the sterol is extracted from the
lysed cells using a liquid organic solvent as described before, to form a sterol-containing liquid solution or
suspension. The sterol is thereafter isolated from the liquid solution or suspension by usual means such as
chromatography.

The sterol is isolated from the solution or suspension produced above using methods that are well known
to those skilled in the art of sterol isolation. These methods include, but are not limited to, purification
procedures based on solubility in various liquid media, chromatographic techniques such as column
chromatography and the like.

Best Mode For Carrying Out The Invention


The following examples illustrate the best mode of carrying out the invention and are not to be construed
as limiting of the specification and claims in any way.

EXAMPLE 1: Transformation of S. cerevisiae


Yeast of the species S. cerevisiae were transformed in accordance with a lithium acetate procedure [Ito
et al., J. Bacteriol., 153:163-168 (1983)]. Yeast cells were grown in about 50 ml of YEPD medium (yeast extract,
1 percent w/v; bactopeptone, 2 percent w/v; and dextrose, 2 percent w/v) overnight at about 30°C. When the
concentration of cells was about 2 x 10^7 cells/ml, the cells were collected by low speed centrifugation. Cells
appearing in the pellet of the centrifugation were suspended in about 50 ml of TE buffer (10 mM Tris-Cl, 1 mM
EDTA) and repelleted by centrifugation. The pellet from this second centrifugation was resuspended in about
1.0 ml of TE buffer. To 0.5 ml of this cell suspension was added 0.5 ml of 0.2 M lithium acetate (LiOAc), and
the suspension was maintained at about 30°C for one hour with constant shaking.

Recombinant DNA (about 10 µg in up to 15 µl of TE buffer) was added to 100 µl of the TE-LiOAc cell
suspension and the admixture maintained at about 30°C for one-half hour without shaking. The DNA-containing
cell suspension was then well mixed with polyethylene glycol (44 percent w/v) such that the final concentration
of polyethylene glycol (PEG) was about 35 percent (w/v).
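
The volume of 44 percent PEG needed to reach a final concentration of about 35 percent follows from a simple mass balance. The sketch below is a worked calculation under stated assumptions: the sample is taken as 100 µl of cell suspension plus the maximum 15 µl of DNA solution, it contains no PEG before mixing, and volumes are treated as additive. The function name is hypothetical.

```python
# Minimal sketch: stock volume V_s such that  c_stock * V_s / (V_sample + V_s) = c_final,
# i.e.  V_s = V_sample * c_final / (c_stock - c_final).
# Assumes additive volumes and no PEG in the starting sample.


def stock_volume_for_final_conc(sample_volume_ul, stock_pct, final_pct):
    """Volume of stock (µl) to add to a PEG-free sample to reach final_pct (w/v)."""
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return sample_volume_ul * final_pct / (stock_pct - final_pct)


sample_ul = 100 + 15   # cell suspension plus the maximum DNA volume (assumed)
peg_ul = stock_volume_for_final_conc(sample_ul, stock_pct=44, final_pct=35)
print(f"add about {peg_ul:.0f} µl of 44% PEG to {sample_ul} µl of suspension")
```

Under these assumptions the result is roughly 450 µl of PEG stock per transformation; a smaller DNA volume lowers the figure proportionally.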

The cells were maintained in this PEG solution at about 30°C for about two hours and then at about 42°C
for about five minutes. About 10 ml of sterile, distilled water was added to each suspension and the cells were
collected by low speed centrifugation. This procedure was repeated, and the collected cells were dispersed in
about 1.0 ml of distilled water. Approximately 100 to 200 µl of this suspension were then spread-plated on
selective medium.

Transformation of cells was confirmed by growth on selection medium, identification of cell characteristics
indicative of transformation (i.e., increased levels of selected sterols or squalene), and Southern blot analysis
of nucleic acid isolated from such transformed cells [Holm et al., Gene, 42:169-173 (1986)].

EXAMPLE 2: Construction of Episomal Plasmid pSOC725ARC


Plasmid pSOC725ARC (See Figure 3) was constructed to place a coding sequence for a truncated HMG1
gene under control of the GAL 1 portion of a GAL 1-10 promoter. Plasmid pSOC725ARC also contains the TRP
1 gene and the yeast 2 micron origin of replication (IR1). This plasmid was prepared from intermediate plasmids
as follows.

The TRP 1-ARS gene of S. cerevisiae was removed from plasmid YRP12 [Stinchcomb et al., Nature, 282:39
(1979)] by digestion with Eco RI. The 1445 base pair DNA fragment containing the TRP 1-ARS gene was
purified on an agarose gel and ligated into plasmid pUC8 [Vieira et al., Gene, (1982)], which had been digested
with Eco RI, to form plasmid pSOC742.

A yeast episomal replication origin, obtained from purified S. cerevisiae two-micron plasmid DNA, was
digested with Eco RI and then treated with the Klenow fragment of E. coli DNA polymerase I to yield an about
2240 base pair fragment containing the two-micron origin of DNA replication. The about 2240 base pair
fragment was purified by agarose gel electrophoresis and ligated into plasmid pUC8, which had been digested
with Sma I, to form plasmid pSOC743.

Plasmid pSOC742 was cleaved with Bam HI and Bgl II to yield an 857 base pair, TRP 1-containing gene
fragment, which was inserted into pSOC743 that had been cut with Bam HI to form plasmid pSOC744.

The MEL1 gene was removed from plasmid pMP550 [Sumner-Smith et al., Gene, 36:333-340 (1985)] with
restriction endonucleases Eco RI and Bam HI, and the about 2858 base pair restriction fragment containing
MEL1 was purified on an agarose gel. The purified fragment was then ligated into plasmid pUC8, which had
been digested with Eco RI and Bam HI, to form plasmid pSOC741.

The final stage of assembly of pSOC740 was achieved by purifying an about 3101 base pair Eco RI
restriction fragment of pSOC744 that contained the TRP 1 gene and two-micron origin, and ligating it into Eco
RI-cleaved plasmid pSOC741 to form plasmid pSOC740.

The GAL 1-10 promoter was excised from pBM258 [Johnston et al., Proc. Natl. Acad. Sci. USA,
79:6971-6975 (1982)] as a 685 base pair Bam HI-Eco RI restriction fragment, and ligated into pUC18, which had
been digested with Bam HI and Eco RI, to form plasmid pSOC711.

Plasmid pSOC740 was digested with Eco RI and the resulting 3101 base pair fragment, containing the
two-micron origin of replication and the TRP 1 gene, was isolated and ligated into the Eco RI-digested plasmid
pSOC711 to produce plasmid pSOC712, in which the TRP 1 gene is proximal to the GAL 1-10 promoter.

A Pst I restriction site spanning the coding sequence for amino acid residues 529-530 of HMG-CoA
reductase 1 was chosen as the point at which to introduce both a new Bam HI restriction site and a new initiator
methionine codon. A 1706 base pair Pst I-Eco RI restriction fragment, containing the coding sequence for the
COOH-terminal half of HMG-CoA reductase 1, was purified from a digest of pJR59 [Basson et al., Proc. Natl.
Acad. Sci. USA, 83:5563-5567 (1986)]. This purified pJR59 fragment and a synthetic oligonucleotide:

d5'-GATCCGTCGACGCATGCCTGCA-3' (SEQ ID NO: 7)

    d3'-GCAGCTGCGTACGG-5' (SEQ ID NO: 8)

were ligated with pUC18 [Yanisch-Perron et al., Gene, 33:103-119 (1985)], which had been cleaved with Bam
HI and Eco RI.
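
The two strands of the synthetic oligonucleotide can be checked for complementarity with a short sketch. It uses only the two sequences given above (SEQ ID NO: 7 and SEQ ID NO: 8); the helper names are hypothetical. The bottom strand is written 3' to 5' in the text, so it is reversed before taking the reverse complement, and the unpaired ends of the top strand are reported as overhangs.

```python
# Minimal sketch: find where the bottom strand pairs on the top strand and report
# the single-stranded overhangs left at each end of the top strand.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def top_strand_overhangs(top, bottom_5_to_3):
    """Return (5' overhang, paired region, 3' overhang) of the top strand."""
    paired = reverse_complement(bottom_5_to_3)
    i = top.find(paired)
    if i == -1:
        raise ValueError("strands are not complementary")
    return top[:i], paired, top[i + len(paired):]


TOP = "GATCCGTCGACGCATGCCTGCA"        # SEQ ID NO: 7, 5'->3'
BOTTOM_3_TO_5 = "GCAGCTGCGTACGG"      # SEQ ID NO: 8, written 3'->5' as in the text

five_prime, paired, three_prime = top_strand_overhangs(TOP, BOTTOM_3_TO_5[::-1])
print(five_prime, paired, three_prime)
```

The annealed duplex leaves a 5' GATC overhang and a 3' TGCA overhang on the top strand, which is what one would expect for an adapter meant to join a Bam HI-cut end to a Pst I-cut end.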

The resulting plasmid, pSOC937, contained a Bam HI restriction site 12 base pairs upstream of the
truncated HMG-CoA reductase coding sequence initiator methionine. The polypeptide formed from initiation at
that point had initial methionine and proline residues followed by amino acid residues 530 through 1054 of the
natural HMG-CoA reductase 1.

The Eco RI restriction site, which is at the 3' end of the gene, is located 135 base pairs past the end of the
coding sequence for the truncated HMG-CoA reductase protein. The truncated gene for HMG-CoA reductase
was placed into plasmid pSOC712 by converting the Eco RI site at the 3' end of the truncated reductase gene
to a Bam HI site (Klenow polymerase filled, ligated to an oligonucleotide, d5'-CGGATCC, specifying the Bam
HI restriction site) and cleaving the preparation with endonuclease Bam HI. A purified, resulting 1728 base pair
Bam HI-ended restriction fragment from pSOC937 was ligated into the Bam HI-digested pSOC712 to produce
plasmid pSOC725ARC, whose schematic restriction map is shown in Figure 3.
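
The sizes quoted in this example can be cross-checked with a little arithmetic. The sketch below is a rough consistency check, not a sequence-level reconstruction: it counts the residues of the truncated polypeptide (methionine, proline, then residues 530 through 1054), converts that to coding base pairs, and adds the 12 base pairs upstream and 135 base pairs downstream mentioned in the text. Whether the stop codon is counted within the 135 base pairs, and the few bases contributed by restriction site ends, are assumptions that are ignored here.

```python
# Minimal sketch: length bookkeeping for the truncated HMG-CoA reductase construct.
# All inputs are figures stated in the text; boundary conventions are assumptions.

FIRST_RESIDUE = 530      # first native residue retained
LAST_RESIDUE = 1054      # last native residue
ADDED_RESIDUES = 2       # initiator methionine and proline
UPSTREAM_BP = 12         # Bam HI site to initiator methionine
DOWNSTREAM_BP = 135      # end of coding sequence to Eco RI site

native_residues = LAST_RESIDUE - FIRST_RESIDUE + 1
polypeptide_length = ADDED_RESIDUES + native_residues
coding_bp = 3 * polypeptide_length
estimated_fragment_bp = UPSTREAM_BP + coding_bp + DOWNSTREAM_BP

print(polypeptide_length, coding_bp, estimated_fragment_bp)
```

With these conventions the estimate comes to 1728 base pairs, in line with the Bam HI-ended fragment size reported above; a different treatment of the stop codon or site boundaries would shift the figure by a few base pairs.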

EXAMPLE 3: Construction of Episomal Plasmid pSOC106ARC

Plasmid pSOC106ARC (See Figure 4) was constructed to place a coding sequence for intact HMG1 under
the control of the GAL 1 portion of a GAL 1-10 promoter.

A 610 base pair Bgl II fragment from pJR59 (about positions 9026-9636), containing the DNA surrounding
the beginning of the HMG-CoA reductase coding sequence, was isolated and further restricted with Dde I to
provide a DNA fragment (about positions 9151-9636) starting 68 base pairs upstream of the first codon of the
HMG-CoA reductase coding sequence.

The Dde I and Bgl II fragments were treated with the Klenow fragment of DNA polymerase to render the
ends "blunt". The fragments were then ligated to oligonucleotide linkers, d5'-CCGGATCCGG-3' (SEQ ID NO: 9),
specifying a Bam HI cleavage site (BRL linkers). The ligated fragments were digested with Bam HI to produce
ligatable Bam HI restriction ends, and the resulting 499 base pair fragment containing the start of the
HMG-CoA reductase coding sequence was ligated into Bam HI-digested pBR322 to form plasmid pSOC104.

The remainder of the HMG-CoA reductase coding sequence was reconstructed downstream of the new 5'
Bam HI site by ligating a 1477 base pair Xba I-Sac I DNA fragment of pJR59, which specifies the 5' half of the
HMG-CoA reductase coding sequence, and a 2101 base pair Sac I-Sal I fragment of pJR59, which specifies
the 3' half of the HMG-CoA reductase coding sequence, into pSOC104 digested with Xba I and Sal I to form
plasmid pSOC105 containing a 3903 base pair Bam HI-Sal I restriction fragment having the entire coding
sequence for HMG-CoA reductase. This 3903 base pair fragment was ligated into Bam HI-Sal I-restricted
pSOC712 (See Example 2) to form plasmid pSOC106ARC.

EXAMPLE 4: Construction of Integrating Plasmid pARC306E


Plasmid pARC306E (See Figure 5) was constructed to place a coding sequence for truncated HMG1 under
control of the GAL 1 portion of a GAL 1-10 promoter.

Plasmid pARC306E contains the S. cerevisiae TRP 1 gene and a GAL 1 promoter-driven, truncated
HMG-CoA reductase gene housed on an E. coli replicon, which specifies ampicillin resistance. There are no S.
cerevisiae replicators on plasmid pARC306E. Unique restriction sites within both the TRP 1 gene (Eco RV,
position 865) and the truncated HMG-CoA reductase gene (Cla I, position 4280) serve as sites for the
generation of linear plasmids with DNA homologous to S. cerevisiae chromosomal DNA on both sides of the
restriction site. Thus, plasmid pARC306E can be incorporated into the chromosome at either site via
homologous recombination.

The multiple restriction recognition site of plasmid pUC8, located between the Eco RI and Hind III sites,
was replaced by the oligonucleotide:

d5'-AGCTTTCGCGAGCTCGAGATCTAGATATCGATG (SEQ ID NO: 10)

d3'-AGCGCTCGAGCTCTAGATCTATAGCTACTTAA-5' (SEQ ID NO: 11)

to create plasmid pUC8NL, which has a single restriction site for the nuclease enzyme Cla I.

Plasmid pSOC712 (See Example 2) was digested with Eco RI and the fragments treated with nuclease S1
and bacteriophage T4 DNA polymerase plus deoxynucleotides to remove the overhanging 5' Eco RI restriction
ends. These ends were ligated to the oligonucleotide:

d5'-CATCGATG-3'

d3'-GTAGCTAC-5'

and the fragments treated with Cla I nuclease to produce Cla I restriction ends.
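
The linkers used in these examples are self-complementary and carry the recognition site of the enzyme used to trim them. A minimal sketch of that check follows; the recognition sequences for Bam HI (GGATCC) and Cla I (ATCGAT) are standard, while the dictionary and function names are hypothetical.

```python
# Minimal sketch: confirm that each linker is palindromic (identical to its reverse
# complement) and locate the restriction recognition site it carries.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
RECOGNITION_SITES = {"Bam HI": "GGATCC", "Cla I": "ATCGAT"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(seq):
    """True when one oligonucleotide can anneal to a second copy of itself."""
    return seq == reverse_complement(seq)


def sites_in(seq):
    """Map enzyme name to the 0-based position of its site, for sites present in seq."""
    return {name: seq.find(site) for name, site in RECOGNITION_SITES.items() if site in seq}


for linker in ("CATCGATG", "CCGGATCCGG"):   # Cla I linker (Example 4), Bam HI linker (Example 3)
    print(linker, is_self_complementary(linker), sites_in(linker))
```

Both linkers anneal to themselves, so a single oligonucleotide species suffices to form the double-stranded linker before ligation.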

The resulting 3108 base pair Cla I-Cla I fragment, containing the yeast TRP 1 gene and the two-micron
replicator, was purified by gel electrophoresis and ligated into pUC8NL, which had been cleaved with Cla I, to
create plasmid pARC300A.

A 2031 base pair fragment containing the two-micron replication origin was removed from pARC300A by
treatment with nuclease Pst I. The resulting modified plasmid pARC300A was treated with nuclease S1 and
bacteriophage T4 DNA polymerase plus deoxynucleotides to remove the Pst I restriction overhangs and with
calf intestinal alkaline phosphatase to disallow reclosure of the plasmid. The modified pARC300A plasmid was
ligated with the oligonucleotide:

d5'-CATCGATG-3'
d3'-GTAGCTAC-5'

to introduce a Cla I site just downstream (to the 3' end) of the TRP 1 gene to form a plasmid, and then closed
to form pARC306B. The TRP 1 gene was separate from yeast replicators, and bounded by Cla I restriction sites.

Plasmid pARC306B was digested with Cla I, purified by polyacrylamide gel electrophoresis and the Cla
I-Cla I restriction fragment was introduced into plasmid pUC8, which had been cleaved with nuclease Acc I, to
form plasmid pARC306C.

As the integration of exogenous DNA into yeast chromosomes is best carried out using homologous
recombination, a dispensable fragment of yeast DNA was desired. This DNA would be used to drive homologous
recombination if, for some reason, recombination at the TRP 1 or HMG-CoA reductase gene were not utilizable.
The DNA chosen for this purpose was the HIS3 gene.

An 1800 base pair Bam HI-Bam HI restriction fragment was removed from plasmid YEP6 [Struhl et al., Proc.
Natl. Acad. Sci. USA, 76:1035 (1979)] and introduced into plasmid pARC306C, which had been cleaved with
Bam HI, to create plasmid pARC306D. Plasmid pSOC725 (See Example 2) was digested with Eco RI to yield
the GAL 1-10 promoter linked to a truncated HMG-CoA reductase gene, which was then inserted into Eco
RI-digested plasmid pARC306D, to form plasmid pARC306E.

EXAMPLE 5: Construction of Integrating Plasmid pARC300D

Plasmid pARC300D (See Figure 6) was constructed to place a coding sequence for a truncated HMG1
gene under the control of a PGK promoter. This plasmid was prepared from intermediate plasmids as follows.

Plasmid pSOC611 was constructed to determine the efficacy of the mouse metallothionein promoter as a
transcriptional driver for the truncated HMG-CoA reductase gene in yeast. Construction of pSOC611 began
with restriction of plasmid pSOC744 (See Example 2) with Eco RI endonuclease, followed by treatment with
Klenow Polymerase I and deoxynucleotide triphosphates to fill in the Eco RI restriction ends. The resulting about
3101 base pair 2-micron- and TRP 1-containing fragment of pSOC744 was ligated to pUC18, which had been
cleaved with Hinc II, to form plasmid pSOC517.

Plasmid pSOC517 was then cleaved with Kpn I and Eco RI and the mouse metallothionein promoter was
inserted as a Kpn I-Eco RI restriction fragment to form plasmid pSOC518. This promoter region is composed
of the Kpn I to Bgl II fragment originally in pJYMMT (e) [Hammer et al., Journal of Applied Molecular Genetics,
Vol. 1:273 (1982)] as well as a short Bgl II, Eco RI DNA fragment of unknown sequence.

The truncated HMG-CoA reductase gene was added to pSOC518 in two steps. First, the truncated HMG-CoA
reductase gene was removed from pSOC725 as a Bam HI restriction fragment. This fragment was then
ligated into M13mp7 which had been cleaved with Bam HI. The new M13 derivative formed was designated
pSOC610. The truncated HMG-CoA reductase gene was removed from pSOC610 as an Eco RI fragment and
inserted into Eco RI-digested plasmid pSOC518. The resulting plasmid was designated pSOC611.

Plasmid pUC8 was partially digested with restriction endonuclease Hae II and religated. Transformants
arising from this procedure were screened to find a plasmid missing the Hae II restriction fragment containing
the portion of the lac operon which was originally present in plasmid pUC8. This new plasmid was designated
pSOC505ARC. Restriction sites for the endonucleases Eco RI, Hind III and Kpn I were introduced into the Nde
I site of plasmid pSOC505ARC by ligation of the oligonucleotide:

d5'-TATCGAATTCAAGCTTGGTACCGA-3'   (SEQ ID NO: 12)
   3'-AGCTTAAGTTCGAACCATGGCTAT-5' (SEQ ID NO: 13)

into Nde I-digested pSOC505ARC to form plasmid pARC303A.

To form the new multi-cloning site, the normal multi-cloning site present in M13mp18 was altered by ligating
the oligonucleotide:

d5'-GATCCAGCTGTGTAC-3' (SEQ ID NO: 14)
    d3'-GTCGACA-5'

into Bam HI-Kpn I digested M13mp18. This resulted in an altered M13 virus, designated pARC303B. This
construct lacked both the Kpn I and Sma I sites normally found in the M13mp18 multi-cloning site. The new
multi-cloning site was removed as an Eco RI, Hind III restriction fragment from pARC303B, and was ligated into
Eco RI, Hind III restricted plasmid pARC303A to form plasmid pARC303C.

In addition to a variation in the normal array of sites included in the multi-cloning site, another smaller
multi-cloning site was introduced into the vector, at a point some distance away from the first multi-cloning site, to
allow for independent manipulation of yeast auxotrophic complementation markers and other features which
did not have to be proximal to the promoters and coding sequences which would be inserted in the large
multi-cloning site. The new array of restriction sites was introduced by ligation of the oligonucleotide:

    d5'-CCCGGGATCGATCACGT-3' (SEQ ID NO: 15)
d3'-TGCAGGGCCCTAGCTAG-5'     (SEQ ID NO: 16)

into pARC303C cleaved with endonuclease Aat II to form plasmid pARC300E, which contained the series of
cloning sites, Aat II, Sma I, and Cla I at the former Aat II site.

The yeast TRP 1 gene was isolated as an 820 base pair fragment from pARC306B (See Example 4) with
the restriction endonuclease Cla I. The 820 base pair Cla I-Cla I fragment was purified by agarose gel
electrophoresis and ligated into plasmid pARC300E, which had been digested with Cla I, to create plasmid
pARC300B.

Plasmid pSOC611 was digested with Bam HI and Ssp I to yield a 1667 base pair coding sequence for the
truncated HMG-CoA reductase gene, which was purified by agarose gel electrophoresis. The 1667 base pair
fragment was ligated to Bam HI, Hinc II restricted plasmid pARC300B to generate plasmid pARC300C.

An alternative to the GAL 1-10 promoter, which had been used to drive transcription of
the truncated HMG-CoA reductase gene, was desired. Use of the GAL 1-10 promoter requires that the yeast
be cultured on galactose, an expensive substrate. To achieve high levels of transcription of the
truncated HMG-CoA reductase gene during growth in the presence of the much less expensive
substrate, glucose, the promoter from the S. cerevisiae phosphoglycerate kinase (PGK) gene was isolated. The
sequence of the gene is available from the literature [Hitzeman et al., Nucl. Acids Res., 10:7791-7808 (1982)].

From the known sequence, an oligonucleotide sufficiently complementary to the gene to be used as
a hybridization probe was synthesized:

d5'-ATAAAGACATTGTTTTTAGATCTGTTGTAA-3' (SEQ ID NO: 17)

This probe was labelled by T4 polynucleotide kinase treatment in the presence of ³²P-ATP, and used to screen
a library of bacteriophage λ subclones of the yeast genome, supplied by Maynard Olson (Washington University
School of Medicine, Department of Genetics, St. Louis, Mo.). The gene was removed from this clone as an
Eco RI-Hind III fragment, and subcloned into M13mp18, forming a new phage, mARC127.

To make the PGK promoter useful, the restriction site at the 5' end of the promoter was changed to an Eco
RI restriction site, and a Bgl II restriction site was introduced into the DNA fragment to the 3' side of the
transcriptional start site. The Bgl II restriction site was introduced by using the oligonucleotide:

d5'-ATAAAGACATTGTTTTTAGATCTGTTGTAA-3' (SEQ ID NO: 17),

to mutagenize mARC127 according to the procedure of Kunkel et al., Proc. Natl. Acad. Sci. USA, 82:4778
(1985). This resulted in the M13 phage designated mARC128.

The Hind III site beyond the 5' end of the promoter region was converted to an Eco RI site by cutting
mARC128 with nuclease Hind III, treating with the Klenow fragment of DNA polymerase and the four
deoxynucleotide triphosphates, followed by ligation in the presence of the oligonucleotide:

d5'-GGAATTCC-3',

which specifies an Eco RI site. The resulting M13 derivative was designated pARC306L.

Plasmid pARC306L was digested with Eco RI and Bgl II and a 1500 base pair fragment containing the PGK
promoter was purified by agarose gel electrophoresis and ligated into pARC300C, which had been restricted
with Eco RI and Bam HI, to produce plasmid pARC300D.

EXAMPLE 6: Construction of Integrating Plasmids pARC300S and pARC300T

Plasmids pARC300S (See Figure 7) and pARC300T (See Figure 8) were constructed to incorporate a URA
3 selectable marker into an integrating vector, in which a coding sequence for a truncated HMG1 gene was under
the control of a PGK promoter.

The only difference between plasmid pARC300S and pARC300T is the length of the PGK promoter driving
transcription of the truncated reductase coding sequence. A unique Eco RV restriction site found within the URA
3 gene allows the plasmids to be linearized and integrated via homologous recombination into the chromosomal
URA 3 gene.

The URA 3 gene from plasmid YEP24 (Botstein et al., Gene, 8:17-24 (1979)) was removed as an 1127
base pair Eco RI-Sma I ended restriction fragment and ligated into plasmid pUC19, cut with Eco RI and Sma
I, to form a new plasmid, LpARCLH550. An 1141 base pair Hind III ended restriction fragment was removed from
LpARCLH550 and ligated into Hind III-cleaved pUC18 to form plasmid LpARCLH553a. An 1108 base pair Sma
I-Hind III restriction fragment was removed from LpARCLH553a and inserted into Sma I-Hind III cleaved
M13mp19 nucleic acid to create a new phage nucleic acid, pARC306K. The unique Pst I site within the URA 3
gene was eliminated by mutagenesis with the oligonucleotide:

d5'-GATTTATCTTCGTTTCCTGCAAGTTTTTGTTC-3' (SEQ ID NO: 18),

using the method of Kunkel et al., Proc. Natl. Acad. Sci. USA, 82:4778 (1985), to form plasmid
pARC300Z.

Plasmid pARC300Z was cut with Hind III, the ends filled in with the Klenow fragment of DNA polymerase
and deoxynucleotide triphosphates, and the modified pARC300Z ligated with oligonucleotide
d5'-CCCCGGGG-3', which specified a Sma I restriction site. This new M13 derivative, which contains the URA 3
gene on a Sma I restriction fragment, was named plasmid pARC300Y.

Plasmid pARC304A was constructed to provide a source of a modified URA 3 transcription terminator
fragment which could then be introduced at the 3' end of the coding sequence region in a yeast integrating
transformation vector. The transcription terminator would function to improve mRNA stability in species transformed
with integrating vectors containing coding sequences either lacking the terminator or having only weak
terminator sequences. Improved mRNA stability could mean increased activity of the protein encoded by the
coding sequence region. The terminator chosen was a region of the S. cerevisiae URA 3 gene which functions as a
terminator [Yarger et al., Molecular and Cellular Biology, 6:1095 (1986)]. The terminator sequence was
constructed using 4 synthetic oligomers:

d5'-AGCTTCGAAGAACGAAGGAAGGAGCACAGACTTAG-3' (SEQ ID NO: 19)

d5'-ATTGGTATATATACGCATATTGCGGCCGCGGTAC-3' (SEQ ID NO: 20)

d5'-CGCGGCCGCAATATGCGTATATATAC-3' (SEQ ID NO: 21)

d5'-CAATCTAAGTCTGTGCTCCTTCCTTCGTTCTTCGA-3' (SEQ ID NO: 22)

These oligomers were designed to provide Hind III and Kpn I restriction ends, respectively. The modified URA
3 transcription terminator was assembled by ligating all four oligomers to each other and digesting the ligation
product with Hind III and Kpn I to produce ligatable Hind III-Kpn I restriction ends. The 67 base pair fragment
was isolated on a polyacrylamide gel, purified by electroeluting the DNA from the gel fragment, and then ligated
into Hind III-Kpn I restricted pUC118 (ATCC 37462). This construction created a new plasmid designated
pARC304A.
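
The way the four oligomers fit together can be illustrated with a short check. This is only a sketch under an assumption that the text does not state explicitly: that SEQ ID NO: 19 and 20 together form one strand and SEQ ID NO: 21 and 22 together form the opposite strand. The function and variable names are illustrative.

```python
# Sketch: test whether the four terminator oligomers anneal into one duplex
# with a Hind III-type 5' overhang (AGCT) and a Kpn I-type 3' overhang (GTAC).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Oligomers as printed in the text, written 5'->3'.
SEQ19 = "AGCTTCGAAGAACGAAGGAAGGAGCACAGACTTAG"
SEQ20 = "ATTGGTATATATACGCATATTGCGGCCGCGGTAC"
SEQ21 = "CGCGGCCGCAATATGCGTATATATAC"
SEQ22 = "CAATCTAAGTCTGTGCTCCTTCCTTCGTTCTTCGA"

# Assumed strand pairing (not stated in the text).
TOP = SEQ19 + SEQ20
BOTTOM = SEQ21 + SEQ22


def check_assembly(top: str, bottom: str, overhang: int = 4) -> dict:
    """Compare the top strand, minus its two overhangs, with the bottom strand."""
    core = top[overhang:len(top) - overhang]
    return {
        "duplex_ok": core == reverse_complement(bottom),
        "five_prime_overhang": top[:overhang],
        "three_prime_overhang": top[-overhang:],
    }


if __name__ == "__main__":
    print(check_assembly(TOP, BOTTOM))
```

Under this pairing the two strands are fully complementary across the duplex region, and the single-stranded ends match those left by Hind III and Kpn I, which is consistent with the oligomers having been designed to provide those restriction ends.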

A 67 base pair Hind III-Kpn I fragment containing a URA 3 transcription terminator was isolated from
plasmid pARC304A and ligated into Hind III-Kpn I restricted pARC300E to form plasmid pARC300M. A truncated
HMG-CoA reductase coding sequence was isolated as a 1667 base pair Bam HI-Ssp I fragment from pSOC611
(See Example 5), purified by agarose gel electrophoresis, and ligated to pARC300M, which had been restricted
with Bam HI and Hinc II, to form plasmid pARC300R.

A URA 3 complementing gene was removed from plasmid pARC300Y as an Xma I restriction fragment,
and ligated into the Xma I site of pARC300R to create plasmid pARC300U.

One other change in the restriction sites available on the DNA specifying the PGK promoter was made.
The minimum DNA required to specify full PGK promoter activity has been determined [Stanway, Nucleic Acids
Research, 15:6855-6873 (1987)]. A new Eco RI site was added to the DNA specifying the PGK promoter at a
region just past the minimal 5' required DNA. The site was added by utilizing the oligonucleotide:

d5'-CTTTATGAGGGTAACATGAATTCAAGAAGG-3' (SEQ ID NO: 23),

to mutagenize mARC128 by the method of Kunkel et al., Proc. Natl. Acad. Sci. USA, 82:4778 (1985). This
new M13 derivative was designated pARC306M.

A 1500 base pair phosphoglycerate kinase (PGK) promoter was removed from plasmid pARC306L (See
Example 5) using Eco RI and Bgl II restriction enzymes. The PGK promoter fragment was purified by agarose
gel electrophoresis and ligated to Eco RI and Bam HI restricted pARC300U, to form plasmid pARC300S.

A shortened PGK promoter (555 base pair fragment) was isolated from Eco RI and Bgl II restricted plasmid
pARC306M and inserted into Eco RI-Bam HI digested plasmid pARC300U to form plasmid pARC300T.

The only difference between plasmid pARC300S and plasmid pARC300T is the length of the PGK promoter
driving transcription of the truncated reductase coding sequence. A unique Eco RV restriction site found within
the URA 3 gene allows the plasmids to be linearized and integrated via homologous recombination into the
chromosomal URA 3 gene.

EXAMPLE 7: Construction of Plasmid pARC304S 

Plasmid pARC304S (see Figure 9) was constructed to place the coding sequence of a truncated HMG1
gene under the control of an ADH promoter.

Plasmid pBR322 was digested with Eco RI and Bam HI to yield a fragment containing the ADH1 promoter.
The ADH1-containing fragment was ligated into plasmid pARC300U (See Example 6), which had been cut with
Eco RI and Bam HI, to form pARC304S.

Plasmid pARC304S was deposited pursuant to the Budapest Treaty requirements with the American Type
Culture Collection (ATCC) at 12301 Parklawn Drive, Rockville, MD 20852 U.S.A. on November 9, 1990 and
was assigned Accession No. ATCC 40916.

EXAMPLE 8: Generation of Mutant S. cerevisiae ATC0402mu 

Mutant ATC0402mu was generated to have the GAL a t and trp1 phenotype as well as having defects in
the expression of zymosterol-24-methyltransferase and ergosta-5,7,24(28)-trienol-22-dehydrogenase
enzymes. These enzymes are respectively the erg6 and erg5 gene products of S. cerevisiae.

An erg6 deficient mutant S. cerevisiae, M610-12B, obtained from the Yeast Genetic Stock Center (Univ.
of California, Berkeley, CA), was crossed with an erg5 deficient mutant S. cerevisiae (obtained as a gift from
Dr. Leo Parks, North Carolina State Univ., Raleigh, NC) to produce an erg6-erg5 double mutant, ATC0403mu.
ATC0403mu was then crossed with wild-type S. cerevisiae, DBY745 (Yeast Genetic Stock Center), to
produce mutant ATC0402mu.

Mutant ATC0402mu was deposited pursuant to the Budapest Treaty requirements with the American Type
Culture Collection (ATCC) at 12301 Parklawn Drive, Rockville, MD 20852 U.S.A. on November 9, 1990, and
was assigned Accession No. ATCC 74027.

EXAMPLE 9: Generation of Transformed Mutants ATC1500cp, ATC1502, ATC1503, ATC1551 and 
ATC2401 

Several mutants were generated from the transformation of ATC0402mu using the method of Example 1,
with various expression systems (plasmids) containing HMG-CoA reductase coding sequences under the
transcriptional control of various promoters. The introduction into ATC0402mu of plasmid pSOC106ARC,
constructed according to the method of Example 3, created ATC1503.

The introduction into ATC0402mu of plasmid pSOC725ARC, constructed according to the method of
Example 2, created ATC2401mu.

The introduction into ATC0402mu of plasmid pARC306E, constructed according to the method of Example
4, created ATC1502.

The introduction into ATC0402mu of plasmid pARC300D, constructed according to the method of Example
5, created ATC1500cp.

The creation of strain ATC1551 required the generation of a ura3 derivative of strain ATC1500cp, which
has no auxotrophic markers. The ura3 derivative was created by transforming ATC1500cp with a mutagenic
oligonucleotide using the method of Moerschell et al. [Proc. Natl. Acad. Sci. USA, 85:524-528 (1988)]. The
sequence of the mutagenic oligonucleotide used is:

5'-GCCAAGTAGTTTTTACTCTTCAAGACAGATAATTTGCTGACA-3' (SEQ ID NO: 24)

Mutated yeast cells were selected by their resistance to 5-fluoro-orotic acid (5-FOA), as described in
Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley and Sons, New York (1989), and
screened for their inability to grow in the absence of uracil. The resulting ura3 strain was designated ATC0315rc.
Strain ATC0315rc was then transformed with plasmid pARC304S, constructed according to the method of
Example 7, to create strain ATC1551.

Transformation of strain ATC0315rc with plasmid pARC304S of the present invention resulted in the
greatest degree of sterol accumulation. Further, the growth of a transformed ATC0315rc mutant under conditions
of restricted aeration, as compared to usual culture conditions, resulted in an increased accumulation of
squalene relative to other sterols as well as an increase in the total accumulation of squalene and total sterols.
Mutant ATC0315rc was deposited pursuant to the Budapest Treaty requirements with the American Type
Culture Collection (ATCC) at 12301 Parklawn Drive, Rockville, MD 20852 U.S.A. on September 16, 1991, and
was assigned Accession No. ATCC 74090.

EXAMPLE 10: Generation of Mutant S. cerevisiae ATC6118, ATC0501 and ATC6119

Mutants were obtained using an inducible "TY1-neo" transposon as the mutagenic agent [Boeke et al.,
Science, 239:280-282 (1989)].

Wild type S. cerevisiae JB516 was transformed with plasmid pJEF1105 [Boeke et al., Science, 239:280-282
(1989)], containing an inducible GAL-TY1-neo expression cassette, and plasmid pCGS286, containing a
GAL::lacZ control. The transformed yeast were then spread onto petri dishes containing Xgal
chromogenic indicator dye in two kinds of media: synthetic dextrose (SD) agar media minus uracil and
synthetic galactose (SG) agar media minus uracil. Yeast transformed with plasmid pJEF1105 appeared normal
on dextrose but smaller than untransformed control yeast on galactose media.

The stability of plasmid pJEF1105 was confirmed by shuttling into E. coli for propagation and restriction
analysis.

Once plasmid pJEF1105-transformed yeasts were shown to be competent, the pJEF1105 transformants
were placed on SG-minus uracil agar at a density of no more than 1000 transformants per petri plate. The plates
were incubated at 22°C for five days, during which the mutagenic transposition of the plasmid-borne TY1-neo
occurred. The transformants were then replica plated onto another SG-minus uracil plate and incubated another
five days. Those colonies that survived were replica plated onto YEPD agar plates containing 100 units/ml of
nystatin to select for sterol production and 100 units/ml of G418 (a neomycin analog) to select for the "neo"
phenotype. Transformants that were both nystatin and G418 resistant were evaluated for sterol content and
distribution using gas chromatographic and mass spectrographic analysis and then classified as to the specific
sterol biosynthetic step affected by the mutation.

A yeast deficient in the enzyme episterol-5-dehydrogenase (the erg3 gene product) was isolated and
designated ATC6118.

A yeast deficient in the enzyme zymosterol-24-methyltransferase (erg6) was isolated from plasmid
pJEF1105 mutated yeast DBY745 (Yeast Genetic Stock Center) and designated ATC0501.

ATC0501 was crossed with ATC6118 to produce an erg3-erg6 double mutant designated ATC6119.

EXAMPLE 11: Generation of Transformed Mutant S. cerevisiae ATC2100, ATC2104 and ATC2109

Following the method of Example 1, the introduction into ATC6119 of plasmids pARC300S and pARC300T,
constructed according to the method of Example 6, created ATC2100 and ATC2104 respectively, whereas the
introduction into ATC6118 of plasmid pARC300S created ATC2109.

EXAMPLE 12: Generation of Mutant S. cerevisiae ATC4124

ATC4124 (Yeast Genetic Stock Center) was generated by crossing ATC0403mu with YNN281 (Yeast
Genetic Stock Center) and selecting for the desired mutation. The resulting segregants were then backcrossed
twice with YNN281.

Resulting ATC4124 had a defect in the expression of cholesta-5,7,24(28)-trienol-22-dehydrogenase (the
erg5 gene product).

EXAMPLE 13: Generation of Transformed Mutant S. cerevisiae ATC2107 and ATC2108

Following the method of Example 1, introduction into ATC4124 of plasmid pARC306E, constructed
according to the method of Example 4, created ATC2107 and ATC2108.

EXAMPLE 14: HMG-CoA Reductase Activity in Mutant and Transformed Yeast

HMG-CoA reductase activity was measured in non-transformed and transformed erg5-erg6 mutant yeasts.
About 0.2 ml of 50 mM potassium phosphate buffer, pH 6.8, containing 125 mM sucrose, 20 mM EDTA
and 100 mM KCl was combined with 10 mM DTT (freshly made), 1 mM NADPH, enzyme preparation and water
to make an enzyme solution of about 0.475 ml final volume. The enzyme solution was preincubated at 37°C
for 20 minutes and the incubation reaction initiated with the addition of 100 µM ¹⁴C-HMG-CoA (60,000 dpm in
0.025 ml). After five minutes, the reaction was stopped by the addition of 50 µl of HCl (1:1) and further incubation
at 37°C for 30 minutes to lactonize the product. The product, mevalonolactone, was separated from HMG on
an anion exchanger, AG1-X8 (Bio-Rad), and the radioactivity associated with the product was counted in a
scintillation counter. The results are shown in Table 5, below. The copy number of an added structural gene
encoding a polypeptide having HMG-CoA reductase activity was estimated using standard procedures well known
to those of skill in the transformation art.

TABLE 5

                     Estimated            Specific Activity
                     Copy # of Added      HMG-CoA Reductase
Mutant               Structural Gene      (mmols/min/mg dry wt)

Non-transformed
  ATC0402mu                               0.52

Transformed
  ATC1503            1,2                  0.69
  ATC1500cp          5,6                  1.33
  ATC1512            8,9                  1.01
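
To make the comparison in Table 5 concrete, the specific activities can be expressed relative to the non-transformed control. The following is a minimal sketch that uses only the values listed in the table; the function names are illustrative.

```python
# Sketch: fold increase in HMG-CoA reductase specific activity over the
# non-transformed control, using the values listed in Table 5.

CONTROL_ACTIVITY = 0.52  # ATC0402mu, non-transformed

TRANSFORMANT_ACTIVITY = {
    "ATC1503": 0.69,
    "ATC1500cp": 1.33,
    "ATC1512": 1.01,
}


def fold_increase(activity: float, control: float) -> float:
    """Ratio of a transformant's specific activity to the control value."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return activity / control


def fold_table(activities: dict, control: float) -> dict:
    """Fold increase for every strain, rounded to two decimal places."""
    return {
        strain: round(fold_increase(value, control), 2)
        for strain, value in activities.items()
    }


if __name__ == "__main__":
    for strain, fold in fold_table(TRANSFORMANT_ACTIVITY, CONTROL_ACTIVITY).items():
        print(f"{strain}: {fold}x control")
```

On these figures every transformant shows higher specific activity than the control, with the largest ratio for ATC1500cp.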



EXAMPLE 15: Squalene and Sterol Accumulation in Yeast

The accumulation of squalene and specific sterols was determined in non-transformed and transformed
mutant yeast cultures.

Fifty to one hundred mg of lyophilized yeast cells were extracted/saponified in 10 ml of an ethanol/water
(2:1) solution containing 20 percent (w/v) KOH for two hours at 80°C. Extracts were partially neutralized with
10 ml 1N HCl and extracted twice with 15 ml n-heptane. The sterol-containing heptane fractions were
evaporated to dryness under a stream of N2 and resuspended to an appropriate volume with n-heptane containing
an internal standard (5-alpha-cholestane).

The resuspended samples were analyzed for sterol accumulation by capillary GC with flame ionization
detection.
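
The text does not give the arithmetic used to convert GC peak areas into a percent of dry biomass. A common approach with an internal standard is sketched below; the formula, the unit response factor, and all numerical values in the example are assumptions for illustration, not figures from this disclosure.

```python
# Sketch: internal-standard quantification of one sterol (or squalene) peak.
# All inputs in the example are hypothetical.


def percent_of_dry_biomass(
    analyte_area: float,
    standard_area: float,
    standard_mass_mg: float,
    sample_mass_mg: float,
    response_factor: float = 1.0,
) -> float:
    """Estimate analyte mass from its peak area relative to the internal
    standard (5-alpha-cholestane in the text), then express that mass as a
    percent of the lyophilized sample mass.

    response_factor is the detector response of the analyte relative to the
    standard; 1.0 (equal response) is an assumption.
    """
    if standard_area <= 0 or sample_mass_mg <= 0 or response_factor <= 0:
        raise ValueError("areas, masses and response factor must be positive")
    analyte_mass_mg = (analyte_area / standard_area) * standard_mass_mg / response_factor
    return 100.0 * analyte_mass_mg / sample_mass_mg


if __name__ == "__main__":
    # Hypothetical run: 75 mg of cells, 0.5 mg of internal standard added.
    print(percent_of_dry_biomass(12000.0, 8000.0, 0.5, 75.0))
```

In practice a response factor would be measured for each compound against the internal standard rather than assumed to be 1.0.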

Table 6 contains summary data for non-transformed (control) and transformed mutants having a single
defect (erg3, erg5) in the expression of sterol biosynthetic pathway enzymes.

Table 7 contains summary data for non-transformed (control) and transformed mutants having double
defects (erg3-erg6, erg5-erg6) in the expression of sterol biosynthetic pathway enzymes.

In both Table 6 and Table 7, the transformants were all made by transforming the control mutant having
the same erg mutation.

Sterol levels are expressed as a percent of the dry biomass.

TABLE 6

ERG3 Mutants
                                   Percent of Biomass
                                   Non-transformed    Transformed
                                   ATC6118            ATC2109
a. Squalene                        N.D.*              0.26
b. ergosta-8,22-dienol             0.31               1.08
c. ergosta-7,22-dienol             0.66               1.64
d. ergosta-8-enol                  0.27               0.42
e. ergosta-7-enol                  0.63               0.72

ERG5 Mutants
                                   Percent of Biomass
                                   Non-transformed    Transformed
                                   ATC4124            ATC2107    ATC2108
a. Squalene                        N.D.               1.10       0.49
b. Zymosterol                      0.05               0.25       0.25
c. ergosta-5,7,24(28)-trienol
   and ergosta-5,7-dienol          0.17               1.75       1.19

* N.D. = Not Detectable

The above data illustrate that transformation of mutants having a single defect in the expression of sterol
biosynthetic pathway enzymes resulted in an increased accumulation of squalene and specific sterols (See
Table 6).

Relative to a non-transformed erg3 mutant, erg3 mutants transformed with a plasmid vector useful in the
present invention overaccumulated squalene, ergosta-8,22-dienol, ergosta-7,22-dienol, ergosta-8-enol and
ergosta-7-enol.

Relative to a non-transformed erg5 mutant, erg5 mutants transformed with a plasmid vector useful in the
present invention overaccumulated squalene, zymosterol, and a mixture of ergosta-5,7,24(28)-trienol and
ergosta-5,7-dienol.

Similarly, transformation of mutants having double defects in the sterol biosynthetic pathway enzymes led
to the overaccumulation of squalene and specific sterols.

Relative to a non-transformed erg3-erg6 mutant, erg3-erg6 mutants transformed with a plasmid vector
useful in the present invention overaccumulated squalene, zymosterol and cholesta-7,24-dienol.

Relative to a non-transformed erg5-erg6 mutant, erg5-erg6 double mutants transformed with a plasmid
vector useful in the present invention overaccumulated squalene, zymosterol, cholesta-5,7,24-trienol and
cholesta-7,24-dienol.

The greatest increases in squalene and specific sterol accumulation are seen when erg5-erg6 mutant
ATC0315rc is transformed with plasmid vector pARC304S (mutant ATC1551), as described in Example 9.
Further, the data show that species ATC0402mu, the grandparent strain of mutant ATC0315rc, has elevated
levels of sterols relative to either an erg5 or an erg6 single mutant (see Table 6).

EXAMPLE 16: Induction of Squalene Accumulation in Yeast Transformant ATC1551

It is generally known that restricted aeration induces squalene accumulation at the expense of sterols in
yeast cultures. This occurs because oxygen is required for the enzymatic conversion of squalene to squalene
monoepoxide, which in turn is converted into lanosterol and other yeast sterols.

To determine if high levels of squalene accumulation could be induced in transformants, cultures of
ATC1551 were grown under varying degrees of aeration by varying the volume (and hence the
surface-to-volume ratio) of growth medium in shake-flask cultures and assaying squalene and total sterol at one day
intervals over a period of four days.

Triplicate 250 ml baffled shake-flasks respectively containing 50, 100, 150 and 200 ml of YEP/2 percent
glucose growth medium were inoculated with two ml of a 24 hour liquid culture of ATC1551 grown on a rotary
shaker (200 rpm) at 30°C. Fifty ml culture aliquots were harvested by centrifugation after one, two, three and
four days growth under the aforementioned conditions and lyophilized overnight.

To ensure efficient squalene extraction, approximately 100 mg of each lyophilized sample was agitated for
10 minutes in 15 ml conical tubes containing a suitable quantity of glass beads and a small amount of water.
The disrupted cell material was then extracted three successive times with 10 ml of 100 percent ethanol with
vigorous agitation for one hour at 80°C. The combined ethanol extracts were reduced to dryness under a stream
of nitrogen and redissolved in two ml of heptane containing 5α-cholestane as the internal standard. GC
analyses of squalene were conducted as previously described.

For total sterol analyses, the same samples were reduced under a stream of nitrogen and saponified in 5
ml of 95 percent ethanol/water solution containing 0.3 M KOH for one hour at 80°C. An equivalent volume of
water was added and the samples were twice extracted with 10 ml aliquots of heptane. The heptane extracts
were combined, reduced to a suitable volume and analyzed by GC.

The results are shown in Table 8 (data averaged from triplicate cultures and expressed as percent of dry
biomass).

TABLE 8

                         Growth Medium Volume
Time               50 ml     100 ml    150 ml    200 ml
                         Percent of Dry Biomass
Day 1
  squalene          4.25      5.40      3.61      2.63
  total sterol      9.40      9.52      6.81      5.46
Day 2
  squalene          4.78      6.43     11.89      8.32
  total sterol      8.29      6.44      3.72      2.98
Day 3
  squalene          4.75      8.82     13.54     13.38
  total sterol      7.96      7.65      4.36      4.19
Day 4
  squalene          4.03      7.08     15.99     14.72
  total sterol      7.09      8.62      5.10      3.39

The data show that in transformed erg5-erg6 mutants, squalene is preferentially accumulated as compared
to total sterol by restricting the level of aeration as compared to usual culture conditions (50 ml of growth
medium), particularly after more than about one day of culture. The data also show that restricting the level of
aeration (lowering the surface-to-volume ratio) also increases the sum total of squalene and total sterol
accumulation, after more than about two days of culture.
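
The two trends described above can be recomputed from the Table 8 values. The sketch below assumes a particular reading of the table, in which the four values for each day run in order of the 50, 100, 150 and 200 ml flasks; that column assignment and the function names are assumptions.

```python
# Sketch: squalene-to-sterol ratio and combined accumulation per flask volume,
# using Table 8 values (percent of dry biomass). Column order is assumed.

VOLUMES_ML = (50, 100, 150, 200)

TABLE_8 = {
    1: {"squalene": (4.25, 5.40, 3.61, 2.63), "total_sterol": (9.40, 9.52, 6.81, 5.46)},
    2: {"squalene": (4.78, 6.43, 11.89, 8.32), "total_sterol": (8.29, 6.44, 3.72, 2.98)},
    3: {"squalene": (4.75, 8.82, 13.54, 13.38), "total_sterol": (7.96, 7.65, 4.36, 4.19)},
    4: {"squalene": (4.03, 7.08, 15.99, 14.72), "total_sterol": (7.09, 8.62, 5.10, 3.39)},
}


def summarize(day: int) -> dict:
    """Return, for each flask volume, the squalene/sterol ratio and their sum."""
    row = TABLE_8[day]
    return {
        volume: {
            "ratio": round(squalene / sterol, 2),
            "total": round(squalene + sterol, 2),
        }
        for volume, squalene, sterol in zip(VOLUMES_ML, row["squalene"], row["total_sterol"])
    }


if __name__ == "__main__":
    for day in sorted(TABLE_8):
        print(f"Day {day}: {summarize(day)}")
```

Read this way, the ratio of squalene to total sterol stays below 1 in the 50 ml flasks on every day and exceeds 3 in the 150 ml flasks from day 2 onward, which matches the preferential squalene accumulation stated in the text.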

Although the present invention has now been described in terms of certain preferred embodiments, and
exemplified with respect thereto, one skilled in the art will readily appreciate that various modifications,
changes, omissions and substitutions may be made without departing from the spirit thereof.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

     (i) APPLICANT: Saunders, Court A.
                    Wolf, Fred R.
                    Mukharji, Indrani

    (ii) TITLE OF INVENTION: A Method and Composition for Increasing
         the Accumulation of Squalene and Specific Sterols in
         Yeast

   (iii) NUMBER OF SEQUENCES: 24

    (iv) CORRESPONDENCE ADDRESS:
         (A) ADDRESSEE: Amoco Corp., Patents and Licensing Dept.
         (B) STREET: 200 East Randolph St.
         (C) CITY: Chicago
         (D) STATE: Illinois
         (E) COUNTRY: USA
         (F) ZIP: 60680-0703

     (v) COMPUTER READABLE FORM:
         (A) MEDIUM TYPE: Floppy disk
         (B) COMPUTER: IBM PC compatible
         (C) OPERATING SYSTEM: PC-DOS/MS-DOS
         (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
         (A) APPLICATION NUMBER: 07/613,380
         (B) FILING DATE: November 15, 1990
         (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
         (A) NAME: Galloway, Norvall B.

    (ix) TELECOMMUNICATION INFORMATION:
         (A) TELEPHONE: 312 856-7180
         (B) TELEFAX: 312 856-4972

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
         (A) LENGTH: 3360 base pairs
         (B) TYPE: nucleic acid
         (C) STRANDEDNESS: single
         (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (ix) FEATURE:
         (A) NAME/KEY: CDS
         (B) LOCATION: 121..3282

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTTATTAACT TATTTTTTTC TTCTTTCTAC CCAATTCTAG TCAGGAAAAC ACTAAGGGCT 60 

GGAACATAGT GTATCATTGT CTAATTGTTG ATACAAACTA CATAAATACA TAAAACAAGC 120 

ATG CCG CCG CTA TTC AAG G6A CTC AAA CAG ATG GCA AAG CCA ATT GCC 168 

Met Pro Pro Leu Phe Lys Gly Leu Lys Gin Met Ala Lys Pro lie Ala 
15 10 15 

TAT GTT TCA AGA TTT TCG GCG AAA CGA CCA ATT CAT ATA ATA CTT TTT 216 
Tyr val Ser Arg Phe Ser Ala Lys Arg Pro lie His He He Leu Phe 

20 20 25 30 

TCT CTA ATC ATA TCC GCA TTC GCT TAT CTA TCC GTC ATT CAG TAT TAC 264 
Ser Leu He He Ser Ala Phe Ala Tyr Leu Ser Val He Gin Tyr Tyr 
35 40 45 



16 



25 



30 



TTC AAT GGT TGG CAA CTA GAT TCA AAT AGT GTT TTT GAA ACT GCT CCA 312
Phe Asn Gly Trp Gln Leu Asp Ser Asn Ser Val Phe Glu Thr Ala Pro
50 55 60

AAT AAA GAC TCC AAC ACT CTA TTT CAA GAA TGT TCC CAT TAC TAC AGA 360
Asn Lys Asp Ser Asn Thr Leu Phe Gln Glu Cys Ser His Tyr Tyr Arg
65 70 75 80

GAT TCC TCT CTA GAT GGT TGG GTA TCA ATC ACC GCG CAT GAA GCT AGT 408
Asp Ser Ser Leu Asp Gly Trp Val Ser Ile Thr Ala His Glu Ala Ser
85 90 95

GAG TTA CCA GCC CCA CAC CAT TAC TAT CTA TTA AAC CTG AAC TTC AAT 456
Glu Leu Pro Ala Pro His His Tyr Tyr Leu Leu Asn Leu Asn Phe Asn
100 105 110

AGT CCT AAT GAA ACT GAC TCC ATT CCA GAA CTA GCT AAC ACG GTT TTT 504
Ser Pro Asn Glu Thr Asp Ser Ile Pro Glu Leu Ala Asn Thr Val Phe
115 120 125

GAG AAA GAT AAT ACA AAA TAT ATT CTG CAA GAA GAT CTC AGT GTT TCC 552
Glu Lys Asp Asn Thr Lys Tyr Ile Leu Gln Glu Asp Leu Ser Val Ser
130 135 140

AAA GAA ATT TCT TCT ACT GAT GGA ACG AAA TGG AGG TTA AGA AGT GAC 600
Lys Glu Ile Ser Ser Thr Asp Gly Thr Lys Trp Arg Leu Arg Ser Asp
145 150 155 160






EP0486 290A2 






AGA AAA ACT CTT TTC GAC GTA AAG ACG TTA CCA TAT TCT CTC TAC GAT 648
Arg Lys Ser Leu Phe Asp Val Lys Thr Leu Ala Tyr Ser Leu Tyr Asp
165 170 175

GTA TTT TCA CAA AAT GTA ACC CAA GCA GAC CCG TTT GAC GTC CTT ATT 696
Val Phe Ser Glu Asn Val Thr Gln Ala Asp Pro Phe Asp Val Leu Ile
180 185 190

ATG CTT ACT GCC TAC CTA ATG ATG TTC TAC ACC ATA TTC GGC CTC TTC 744
Met Val Thr Ala Tyr Leu Met Met Phe Tyr Thr Ile Phe Gly Leu Phe
195 200 205

AAT GAC ATG AGG AAG ACC GGG TCA AAT TTT TGG TTC AGC GCC TCT ACA 792
Asn Asp Met Arg Lys Thr Gly Ser Asn Phe Trp Leu Ser Ala Ser Thr
210 215 220

GTG GTC AAT TCT GCA TCA TCA CTT TTC TTA GCA TTG TAT GTC ACC CAA 840
Val Val Asn Ser Ala Ser Ser Leu Phe Leu Ala Leu Tyr Val Thr Gln
225 230 235 240

TGT ATT CTA GGC AAA GAA GTT TCC GCA TTA ACT CTT TTT GAA GGT TTG 888
Cys Ile Leu Gly Lys Glu Val Ser Ala Leu Thr Leu Phe Glu Gly Leu
245 250 255

CCT TTC ATT GTA CTT CTT CTT GGT TTC AAG CAC AAA ATC AAC ATT GCC 936
Pro Phe Ile Val Val Val Val Gly Phe Lys His Lys Ile Lys Ile Ala
260 265 270

CAG TAT GCC CTC GAC AAA TTT GAA AGA GTC GGT TTA TCT AAA AGG ATT 984
Gln Tyr Ala Leu Glu Lys Phe Glu Arg Val Gly Leu Ser Lys Arg Ile
275 280 285

ACT ACC GAT GAA ATC GTT TTT GAA TCC CTC AGC GAA GAG GGT GGT CGT 1032
Thr Thr Asp Glu Ile Val Phe Glu Ser Val Ser Glu Glu Gly Gly Arg
290 295 300



TTG ATT CAA GAC CAT TTC CTT TGT ATT TTT GCC TTT ATC GGA TGC TCT 1080
Leu Ile Gln Asp His Leu Leu Cys Ile Phe Ala Phe Ile Gly Cys Ser
305 310 315 320

ATG TAT GCT CAC CAA TTG AAG ACT TTG ACA AAC TTC TGC ATA TTA TCA 1128
Met Tyr Ala His Gln Leu Lys Thr Leu Thr Asn Phe Cys Ile Leu Ser
325 330 335

GCA TTT ATC CTA ATT TTT GAA TTG ATT TTA ACT CCT ACA TTT TAT TCT 1176
Ala Phe Ile Leu Ile Phe Glu Leu Ile Leu Thr Pro Thr Phe Tyr Ser
340 345 350

CCT ATC TTA GCC CTT AGA CTG GAA ATG AAT GTT ATC CAC AGA TCT ACT 1224
Ala Ile Leu Ala Leu Arg Leu Glu Met Asn Val Ile His Arg Ser Thr
355 360 365
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ATT ATC AAG CAA ACA TTA GAA GAA GAC GGT GTT GTT CCA TCT ACA CCA 1272 
Ile Ile Lys Gln Thr Leu Glu Glu Asp Gly Val Val Pro Ser Thr Ala
370 375 380

ACA ATC ATT TCT AAA GCA GAA AAG AAA TCC GTA TCT TCT TTC TTA AAT 1320
Arg Ile Ile Ser Lys Ala Glu Lys Lys Ser Val Ser Ser Phe Leu Asn
385 390 395 400

CTC ACT GTG GTT GTC ATT ATC ATG AAA CTC TCT GTC ATA CTG TTG TTT 1368
Leu Ser Val Val Val Ile Ile Met Lys Leu Ser Val Ile Leu Leu Phe
405 410 415

GTT TTC ATC AAC TTT TAT AAC TTT GGT GCA AAT TGG GTC AAT GAT GCC 1416
Val Phe Ile Asn Phe Tyr Asn Phe Gly Ala Asn Trp Val Asn Asp Ala
420 425 430

TTC AAT TCA TTG TAC TTC GAT AAG GAA CGT GTT TCT CTA CCA GAT TTT 1464
Phe Asn Ser Leu Tyr Phe Asp Lys Glu Arg Val Ser Leu Pro Asp Phe
435 440 445

ATT ACC TCG AAT GCC TCT GAA AAC TTT AAA GAG CAA GCT ATT GTT AGT 1512
Ile Thr Ser Asn Ala Ser Glu Asn Phe Lys Glu Gln Ala Ile Val Ser
450 455 460

GTC ACC CCA TTA TTA TAT TAC AAA CCC ATT AAG TCC TAC CAA CGC ATT 1560
Val Thr Pro Leu Leu Tyr Tyr Lys Pro Ile Lys Ser Tyr Gln Arg Ile
465 470 475 480

GAG GAT ATG GTT CTT CTA TTG CTT CGT AAT GTC AGT GTT GCC ATT CGT 1608
Glu Asp Met Val Leu Leu Leu Leu Arg Asn Val Ser Val Ala Ile Arg
485 490 495

GAT AGG TTC GTC AGT AAA TTA GTT CTT TCC GCC TTA GTA TCC AGT GCT 1656
Asp Arg Phe Val Ser Lys Leu Val Leu Ser Ala Leu Val Cys Ser Ala
500 505 510






CTC ATC AAT GTC TAT TTA TTG AAT GCT GCT AGA ATT CAT ACC AGT TAT 1704
Val Ile Asn Val Tyr Leu Leu Asn Ala Ala Arg Ile His Thr Ser Tyr
515 520 525

ACT GCA GAC CAA TTG GTG AAA ACT GAA GTC ACC AAG AAC TCT TTT ACT 1752
Thr Ala Asp Gln Leu Val Lys Thr Glu Val Thr Lys Lys Ser Phe Thr
530 535 540

GCT CCT GTA CAA AAG GCT TCT ACA CCA CTT TTA ACC AAT AAA ACA GTC 1800
Ala Pro Val Gln Lys Ala Ser Thr Pro Val Leu Thr Asn Lys Thr Val
545 550 555 560

ATT TCT CCA TCG AAA GTC AAA AGT TTA TCA TCT GCG CAA TCG AGC TCA 1848
Ile Ser Gly Ser Lys Val Lys Ser Leu Ser Ser Ala Gln Ser Ser Ser
565 570 575
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20 






TCA GGA CCT TCA TCA TCT ACT GAG GAA GAT GAT TCC CGC GAT ATT GAA 1896
Ser Gly Pro Ser Ser Ser Ser Glu Glu Asp Asp Ser Arg Asp Ile Glu
580 585 590

AGO TTG GAT AAG AAA ATA OCT CCT TTA GAA GAA TTA GAA GCA TTA TTA 1944
Ser Leu Asp Lys Lys Ile Arg Pro Leu Glu Glu Leu Glu Ala Leu Leu
595 600 605

ACT ACT GGA AAT ACA AAA GAA TTG AAG AAC AAA GAG GTC GCT GCC TTG 1992
Ser Ser Gly Asn Thr Lys Gln Leu Lys Asn Lys Glu Val Ala Ala Leu
610 615 620

GTT ATT CAC GGT AAG TTA CCT TTG TAC GCT TTG GAG AAA AAA TTA GGT 2040
Val Ile His Gly Lys Leu Pro Leu Tyr Ala Leu Glu Lys Lys Leu Gly
625 630 635 640

GAT ACT ACG AGA GCG GTT GCG GTA CCT AGG AAG GCT CTT TCA ATT TTG 2088
Asp Thr Thr Arg Ala Val Ala Val Arg Arg Lys Ala Leu Ser Ile Leu
645 650 655

GCA GAA GCT CCT GTA TTA GCA TCT GAT CGT TTA CCA TAT AAA AAT TAT 2136
Ala Glu Ala Pro Val Leu Ala Ser Asp Arg Leu Pro Tyr Lys Asn Tyr
660 665 670

GAC TAC GAC CGC GTA TTT GGC GCT TGT TGT GAA AAT GTT ATA GGT TAC 2184
Asp Tyr Asp Arg Val Phe Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr
675 680 685

ATG CCT TTG CCC GTT GGT GTT ATA GGC CCC TTG GTT ATC GAT GGT ACA 2232
Met Pro Leu Pro Val Gly Val Ile Gly Pro Leu Val Ile Asp Gly Thr
690 695 700

TCT TAT CAT ATA CCA ATG GCA ACT ACA GAG GGT TGT TTG GTA GCT TCT 2280
Ser Tyr His Ile Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala Ser
705 710 715 720

GCC ATG CGT GGC TGT AAG GCA ATC AAT GCT GGC GGT GGT GCA ACA ACT 2328
Ala Met Arg Gly Cys Lys Ala Ile Asn Ala Gly Gly Gly Ala Thr Thr
725 730 735

GTT TTA ACT AAG GAT GGT ATG ACA AGA GGC CCA GTA GTC CGT TTC CCA 2376
Val Leu Thr Lys Asp Gly Met Thr Arg Gly Pro Val Val Arg Phe Pro
740 745 750

ACT TTG AAA AGA TCT GGT GCC TGT AAG ATA TGG TTA GAC TCA GAA GAG 2424
Thr Leu Lys Arg Ser Gly Ala Cys Lys Ile Trp Leu Asp Ser Glu Glu
755 760 765

GGA GAA AAC GCA ATT AAA AAA GCT TTT AAC TCT ACA TCA AGA TTT GCA 2472
Gly Gln Asn Ala Ile Lys Lys Ala Phe Asn Ser Thr Ser Arg Phe Ala
770 775 780
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CGT CTG CAA CAT ATT CAA ACT TGT CTA GCA GGA GAT TTA CTC TTC ATG 
Arg Leu Gln His Ile Gln Thr Cys Leu Ala Gly Asp Leu Leu Phe Met
785 790 795 800

AGA TTT AGA ACA ACT ACT CGT GAC GCA ATG GGT ATG AAT ATG ATT TCT
Arg Phe Arg Thr Thr Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser
805 810 815

AAA GGT GTC GAA TAC TCA TTA AAG CAA ATG GTA GAA GAG TAT GGC TGG
Lys Gly Val Glu Tyr Ser Leu Lys Gln Met Val Glu Glu Tyr Gly Trp
820 825 830

GAA GAT ATG GAG GTT GTC TCC GTT TCT GGT AAC TAC TGT ACC GAC AAA
Glu Asp Met Glu Val Val Ser Val Ser Gly Asn Tyr Cys Thr Asp Lys
835 840 845

AAA CCA GCT GCC ATC AAC TGG ATC GAA GGT CGT GGT AAG AGT GTC GTC
Lys Pro Ala Ala Ile Asn Trp Ile Glu Gly Arg Gly Lys Ser Val Val
850 855 860

GCA GAA GCT ACT ATT CCT GGT GAT GTT GTC AGA AAA GTG TTA AAA AGT
Ala Glu Ala Thr Ile Pro Gly Asp Val Val Arg Lys Val Leu Lys Ser
865 870 875 880

GAT GTT TCC GCA TTG GTT GAG TTG AAC ATT GCT AAG AAT TTG GTT GGA
Asp Val Ser Ala Leu Val Glu Leu Asn Ile Ala Lys Asn Leu Val Gly
885 890 895

TCT GCA ATG GCT GGG TCT GTT GGT GGA TTT AAC GCA CAT GCA GCT AAT
Ser Ala Met Ala Gly Ser Val Gly Gly Phe Asn Ala His Ala Ala Asn
900 905 910

TTA GTG ACA GCT GTT TTC TTG GCA TTA GCA CAA GAT CCT GCA CAA AAT
Leu Val Thr Ala Val Phe Leu Ala Leu Gly Gln Asp Pro Ala Gln Asn
915 920 925

GTT GAA AGT TCC AAC TGT ATA ACA TTG ATG AAA GAA GTG GAC GGT GAT
Val Glu Ser Ser Asn Cys Ile Thr Leu Met Lys Glu Val Asp Gly Asp
930 935 940

TTG AGA ATT TCC CTA TCC ATG CCA TCC ATC GAA GTA GGT ACC ATC GGT
Leu Arg Ile Ser Val Ser Met Pro Ser Ile Glu Val Gly Thr Ile Gly
945 950 955 960

GGT GGT ACT GTT CTA GAA CCA CAA GGT GCC ATG TTG GAC TTA TTA GGT
Gly Gly Thr Val Leu Glu Pro Gln Gly Ala Met Leu Asp Leu Leu Gly
965 970 975

GTA ACA GCC CCC CAT GCT ACC GCT CCT GGT ACC AAC GCA CGT CAA TTA
Val Arg Gly Pro His Ala Thr Ala Pro Gly Thr Asn Ala Arg Gln Leu
980 985 990
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GCA AGA ATA CTT GCC TGT GCC CTC TTC GCA GGT GAA TTA TCC TTA TOT 3144 
Ala Arg Ile Val Ala Cys Ala Val Leu Ala Gly Glu Leu Ser Leu Cys
995 1000 1005

GCT GCC CTA GCA GCC GGC CAT TTG GTT CAA AGT CAT ATG ACC CAC AAC 3192
Ala Ala Leu Ala Ala Gly His Leu Val Gln Ser His Met Thr His Asn
1010 1015 1020

AGO AAA CCT GCT GAA CCA AGA AAA CCT AAC AAT TTG GAC GCC ACT GAT 3240
Arg Lys Pro Ala Glu Pro Thr Lys Pro Asn Asn Leu Asp Ala Thr Asp
1025 1030 1035 1040

ATA AAT CGT TTG AAA GAT GGG TCC GTC ACC TGC ATT AAA TCC 3282
Ile Asn Arg Leu Lys Asp Gly Ser Val Thr Cys Ile Lys Ser
1045 1050

TAAACTTAGT CATACGTCAT TGGTATTCTC TTGAAAAAGA AGCACAACAG CACCATGTGT 3342

TACGTAAAAT ATTTACTT 3360
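
The codon and residue pairing in the listing above can be spot-checked against the standard genetic code. The following is a minimal sketch, not part of the filed listing: the function name `translate` is hypothetical, and the seventh codon of the first row is assumed to be GGA, consistent with the Gly residue printed beneath it.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter residue names as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(codons: str) -> str:
    """Translate space-separated DNA codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons.upper().split())


if __name__ == "__main__":
    first_row = "ATG CCG CCG CTA TTC AAG GGA CTC AAA CAG ATG GCA AAG CCA ATT GCC"
    print(translate(first_row))
```

Running the same check on other rows is one way to locate codons that the scan may have misread, since a mismatch between the translated codon and the printed residue points at one of the two lines.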


(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1054 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Pro Pro Leu Phe Lys Gly Leu Lys Gln Met Ala Lys Pro Ile Ala
1 5 10 15

Tyr Val Ser Arg Phe Ser Ala Lys Arg Pro Ile His Ile Ile Leu Phe
20 25 30

Ser Leu Ile Ile Ser Ala Phe Ala Tyr Leu Ser Val Ile Gln Tyr Tyr
35 40 45

Phe Asn Gly Trp Gln Leu Asp Ser Asn Ser Val Phe Glu Thr Ala Pro
50 55 60

Asn Lys Asp Ser Asn Thr Leu Phe Gln Glu Cys Ser His Tyr Tyr Arg
65 70 75 80

Asp Ser Ser Leu Asp Gly Trp Val Ser Ile Thr Ala His Glu Ala Ser
85 90 95

Glu Leu Pro Ala Pro His His Tyr Tyr Leu Leu Asn Leu Asn Phe Asn 

100 105 110 
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16 






Ser Pro Asn Glu Thr Asp Ser Ile Pro Glu Leu Ala Asn Thr Val Phe
115 120 125

Glu Lys Asp Asn Thr Lys Tyr Ile Leu Gln Glu Asp Leu Ser Val Ser
130 135 140

Lys Glu Ile Ser Ser Thr Asp Gly Thr Lys Trp Arg Leu Arg Ser Asp
145 150 155 160

Arg Lys Ser Leu Phe Asp Val Lys Thr Leu Ala Tyr Ser Leu Tyr Asp
165 170 175

Val Phe Ser Glu Asn Val Thr Gln Ala Asp Pro Phe Asp Val Leu Ile
180 185 190

Met Val Thr Ala Tyr Leu Met Met Phe Tyr Thr Ile Phe Gly Leu Phe
195 200 205

Asn Asp Met Arg Lys Thr Gly Ser Asn Phe Trp Leu Ser Ala Ser Thr
210 215 220

Val Val Asn Ser Ala Ser Ser Leu Phe Leu Ala Leu Tyr Val Thr Gln
225 230 235 240

Cys Ile Leu Gly Lys Glu Val Ser Ala Leu Thr Leu Phe Glu Gly Leu
245 250 255

Pro Phe Ile Val Val Val Val Gly Phe Lys His Lys Ile Lys Ile Ala
260 265 270

Gln Tyr Ala Leu Glu Lys Phe Glu Arg Val Gly Leu Ser Lys Arg Ile
275 280 285

Thr Thr Asp Glu Ile Val Phe Glu Ser Val Ser Glu Glu Gly Gly Arg
290 295 300

Leu Ile Gln Asp His Leu Leu Cys Ile Phe Ala Phe Ile Gly Cys Ser
305 310 315 320

Met Tyr Ala His Gln Leu Lys Thr Leu Thr Asn Phe Cys Ile Leu Ser
325 330 335

Ala Phe Ile Leu Ile Phe Glu Leu Ile Leu Thr Pro Thr Phe Tyr Ser
340 345 350

Ala Ile Leu Ala Leu Arg Leu Glu Met Asn Val Ile His Arg Ser Thr
355 360 365

Ile Ile Lys Gln Thr Leu Glu Glu Asp Gly Val Val Pro Ser Thr Ala
370 375 380






Arg Ile Ile Ser Lys Ala Glu Lys Lys Ser Val Ser Ser Phe Leu Asn
385 390 395 400

Leu Ser Val Val Val Ile Ile Met Lys Leu Ser Val Ile Leu Leu Phe
405 410 415

Val Phe Ile Asn Phe Tyr Asn Phe Gly Ala Asn Trp Val Asn Asp Ala
420 425 430

Phe Asn Ser Leu Tyr Phe Asp Lys Glu Arg Val Ser Leu Pro Asp Phe
435 440 445

Ile Thr Ser Asn Ala Ser Glu Asn Phe Lys Glu Gln Ala Ile Val Ser
450 455 460

Val Thr Pro Leu Leu Tyr Tyr Lys Pro Ile Lys Ser Tyr Gln Arg Ile
465 470 475 480

Glu Asp Met Val Leu Leu Leu Leu Arg Asn Val Ser Val Ala Ile Arg
485 490 495

Asp Arg Phe Val Ser Lys Leu Val Leu Ser Ala Leu Val Cys Ser Ala
500 505 510

Val Ile Asn Val Tyr Leu Leu Asn Ala Ala Arg Ile His Thr Ser Tyr
515 520 525

Thr Ala Asp Gln Leu Val Lys Thr Glu Val Thr Lys Lys Ser Phe Thr
530 535 540

Ala Pro Val Gln Lys Ala Ser Thr Pro Val Leu Thr Asn Lys Thr Val
545 550 555 560

Ile Ser Gly Ser Lys Val Lys Ser Leu Ser Ser Ala Gln Ser Ser Ser
565 570 575

Ser Gly Pro Ser Ser Ser Ser Glu Glu Asp Asp Ser Arg Asp Ile Glu
580 585 590

Ser Leu Asp Lys Lys Ile Arg Pro Leu Glu Glu Leu Glu Ala Leu Leu
595 600 605

Ser Ser Gly Asn Thr Lys Gln Leu Lys Asn Lys Glu Val Ala Ala Leu
610 615 620

Val Ile His Gly Lys Leu Pro Leu Tyr Ala Leu Glu Lys Lys Leu Gly
625 630 635 640

Asp Thr Thr Arg Ala Val Ala Val Arg Arg Lys Ala Leu Ser Ile Leu
645 650 655

Ala Glu Ala Pro Val Leu Ala Ser Asp Arg Leu Pro Tyr Lys Asn Tyr
660 665 670

Asp Tyr Asp Arg Val Phe Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr
675 680 685






Met Pro Leu Pro Val Gly Val Ile Gly Pro Leu Val Ile Asp Gly Thr
690 695 700

Ser Tyr His Ile Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala Ser
705 710 715 720

Ala Met Arg Gly Cys Lys Ala Ile Asn Ala Gly Gly Gly Ala Thr Thr
725 730 735

Val Leu Thr Lys Asp Gly Met Thr Arg Gly Pro Val Val Arg Phe Pro
740 745 750

Thr Leu Lys Arg Ser Gly Ala Cys Lys Ile Trp Leu Asp Ser Glu Glu
755 760 765

Gly Gln Asn Ala Ile Lys Lys Ala Phe Asn Ser Thr Ser Arg Phe Ala
770 775 780

Arg Leu Gln His Ile Gln Thr Cys Leu Ala Gly Asp Leu Leu Phe Met
785 790 795 800

Arg Phe Arg Thr Thr Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser
805 810 815

Lys Gly Val Glu Tyr Ser Leu Lys Gln Met Val Glu Glu Tyr Gly Trp
820 825 830

Glu Asp Met Glu Val Val Ser Val Ser Gly Asn Tyr Cys Thr Asp Lys
835 840 845

Lys Pro Ala Ala Ile Asn Trp Ile Glu Gly Arg Gly Lys Ser Val Val
850 855 860

Ala Glu Ala Thr Ile Pro Gly Asp Val Val Arg Lys Val Leu Lys Ser
865 870 875 880

Asp Val Ser Ala Leu Val Glu Leu Asn Ile Ala Lys Asn Leu Val Gly
885 890 895

Ser Ala Met Ala Gly Ser Val Gly Gly Phe Asn Ala His Ala Ala Asn
900 905 910

Leu Val Thr Ala Val Phe Leu Ala Leu Gly Gln Asp Pro Ala Gln Asn
915 920 925

Val Glu Ser Ser Asn Cys Ile Thr Leu Met Lys Glu Val Asp Gly Asp
930 935 940
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10 






Leu Arg Ile Ser Val Ser Met Pro Ser Ile Glu Val Gly Thr Ile Gly
945 950 955 960

Gly Gly Thr Val Leu Glu Pro Gln Gly Ala Met Leu Asp Leu Leu Gly
965 970 975

Val Arg Gly Pro His Ala Thr Ala Pro Gly Thr Asn Ala Arg Gln Leu
980 985 990

Ala Arg Ile Val Ala Cys Ala Val Leu Ala Gly Glu Leu Ser Leu Cys
995 1000 1005

Ala Ala Leu Ala Ala Gly His Leu Val Gln Ser His Met Thr His Asn
1010 1015 1020

Arg Lys Pro Ala Glu Pro Thr Lys Pro Asn Asn Leu Asp Ala Thr Asp
1025 1030 1035 1040

Ile Asn Arg Leu Lys Asp Gly Ser Val Thr Cys Ile Lys Ser
1045 1050

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4768 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 164..2827
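
The CDS range given here can be compared with the 887 amino acids stated for SEQ ID NO: 4. The sketch below uses a hypothetical helper, `codons_beyond_protein`, and assumes inclusive, 1-based coordinates. A result of 1 would be consistent with the range including a stop codon; that reading is an inference, not something the listing states.

```python
def codons_beyond_protein(cds_start: int, cds_end: int, protein_length: int) -> int:
    """Return how many codons the CDS range holds beyond the protein length."""
    length = cds_end - cds_start + 1  # inclusive coordinates
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3 - protein_length


if __name__ == "__main__":
    # SEQ ID NO: 3 range against the SEQ ID NO: 4 length.
    print(codons_beyond_protein(164, 2827, 887))
    # SEQ ID NO: 1 range against the SEQ ID NO: 2 length, for comparison.
    print(codons_beyond_protein(121, 3282, 1054))
```

Under these assumptions the first call gives one extra codon and the second gives none, so the two CDS ranges in the listing do not appear to treat the stop codon the same way.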



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TGTATGTCTT GTCTTTCTCC TAAGGGGCGT AGGCTCATTG ATAACTCATG TCCTCACCTT 60 

GCACTCCTTT TGGAATTATT TGGTTTGAGT GAAGAAGACC GGACCTTCGA GGTTCGCAAC 120 

TTAAACAATA GACTTGTGAG GATCCAGGGA CCCAGTCGCT ACA ATG TTG TCA CGA 175
Met Leu Ser Arg
1

CTT TTC CGT ATG CAT GGC CTC TTT GTG GCC TCC CAT CCC TGG GAA GTT 223
Leu Phe Arg Met His Gly Leu Phe Val Ala Ser His Pro Trp Glu Val
5 10 15 20
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ATT GTG GGG ACG GTG ACA CTT ACC ATC TGT ATC ATC TCC ATG AAC ATG 
Ile Val Gly Thr Val Thr Leu Thr Ile Cys Met Met Ser Met Asn Met
25 30 35

TTC ACT GCC AAC AAC AAG ATC TGT GGT TGG AAT TAC GAG TGC CCA AAA
Phe Thr Gly Asn Asn Lys Ile Cys Gly Trp Asn Tyr Glu Cys Pro Lys
40 45 50

TTT GAG GAG GAT GTA TTG AGC ACT GAC ATC ATC ATC CTC ACC ATA ACA
Phe Glu Glu Asp Val Leu Ser Ser Asp Ile Ile Ile Leu Thr Ile Thr
55 60 65

CGG TGC ATC GCC ATC CTG TAC ATT TAC TTC CAG TTC CAG AAC TTA CGT
Arg Cys Ile Ala Ile Leu Tyr Ile Tyr Phe Gln Phe Gln Asn Leu Arg
70 75 80

CAG CTT GGG TCG AAG TAT ATT TTA GGT ATT CCT GGC CTG TTC ACA ATT
Gln Leu Gly Ser Lys Tyr Ile Leu Gly Ile Ala Gly Leu Phe Thr Ile
85 90 95 100

TTC TCA AGT TTT GTC TTT ACT ACA GTC GTC ATT CAC TTC TTA GAC AAA
Phe Ser Ser Phe Val Phe Ser Thr Val Val Ile His Phe Leu Asp Lys
105 110 115

GAA CTG ACG GGC TTA AAT GAA GCT TTG CCC TTT TTC CTG CTT TTG ATT
Glu Leu Thr Gly Leu Asn Glu Ala Leu Pro Phe Phe Leu Leu Leu Ile
120 125 130

GAC CTT TCT AGA GCG AGT GCA CTA GCA AAG TTT GCC CTA AGT TCA AAC
Asp Leu Ser Arg Ala Ser Ala Leu Ala Lys Phe Ala Leu Ser Ser Asn
135 140 145

TCT CAG GAT GAA GTA AGG GAA AAT ATA GCT CGC GGA ATG GCA ATT CTG
Ser Gln Asp Glu Val Arg Glu Asn Ile Ala Arg Gly Met Ala Ile Leu
150 155 160

GGC CCC ACA TTC ACC CTT GAT GCT CTT GTG GAA TGT CTT GTA ATT GGA
Gly Pro Thr Phe Thr Leu Asp Ala Leu Val Glu Cys Leu Val Ile Gly
165 170 175 180

GTT GGC ACC ATG TCA GGG GTG CGT CAG CTT GAA ATC ATG TGC TGC TTT
Val Gly Thr Met Ser Gly Val Arg Gln Leu Glu Ile Met Cys Cys Phe
185 190 195

GGC TGC ATG TCT GTG CTT GCC AAC TAC TTC GTG TTC ATG ACA TTT TTC
Gly Cys Met Ser Val Leu Ala Asn Tyr Phe Val Phe Met Thr Phe Phe
200 205 210

CCA GCG TGT GTG TCC CTG GTC CTT GAG CTT TCT CGG GAA ACT CGA GAG
Pro Ala Cys Val Ser Leu Val Leu Glu Leu Ser Arg Glu Ser Arg Glu
215 220 225
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5 

GGT CGT CCA ATT TGG CAC CTT AGC CAT TTT GCC CGA GTT TTG GAA GAA 895
Gly Arg Pro Ile Trp Gln Leu Ser His Phe Ala Arg Val Leu Glu Glu
230 235 240

GAA GAG AAT AAA CCA AAC CCT GTA ACC CAA AGG GTC AAG ATG ATT ATG 943
Glu Glu Asn Lys Pro Asn Pro Val Thr Gln Arg Val Lys Met Ile Met
245 250 255 260

TCT TTA GGT TTG GTT CTT GTT CAT CCT CAC ACT CGA TGG ATA CCT GAT 991
Ser Leu Gly Leu Val Leu Val His Ala His Ser Arg Trp Ile Ala Asp
265 270 275

CCT TCC CCT CAG AAT AGC AGA ACA GAA CAT TCT AAA GTC TCC TTG GGA 1039
Pro Ser Pro Gln Asn Ser Thr Thr Glu His Ser Lys Val Ser Leu Gly
280 285 290

CTG GAT GAA GAT GTG TCC AAG AGA ATT GAA CCA ACT GTT TCT CTC TGG 1087
Leu Asp Glu Asp Val Ser Lys Arg Ile Glu Pro Ser Val Ser Leu Trp
295 300 305

CAG TTT TAT CTC TCC AAG ATG ATC AGC ATG GAC ATT GAA CAA GTG GTT 1135
Gln Phe Tyr Leu Ser Lys Met Ile Ser Met Asp Ile Glu Gln Val Val
310 315 320

ACC CTG AGC TTA GCT TTT CTG TTG GCT GTC AAG TAC ATT TTC TTT GAA 1183
Thr Leu Ser Leu Ala Phe Leu Leu Ala Val Lys Tyr Ile Phe Phe Glu
325 330 335 340

CAA GCA GAG ACA GAG TCC ACA CTG TCT TTA AAA AAT CCT ATC ACG TCT 1231
Gln Ala Glu Thr Glu Ser Thr Leu Ser Leu Lys Asn Pro Ile Thr Ser
345 350 355

CCT GTC GTG ACC CCA AAG AAA GCT CCA GAC AAC TGT TGT AGA CGG GAG 1279
Pro Val Val Thr Pro Lys Lys Ala Pro Asp Asn Cys Cys Arg Arg Glu
360 365 370

CCT CTG CTT GTG AGA AGG AGC GAG AAG CTT TCA TCG GTT GAG GAG GAG 1327
Pro Leu Leu Val Arg Arg Ser Glu Lys Leu Ser Ser Val Glu Glu Glu
375 380 385

CCT GGG GTG AGC CAA GAT AGA AAA GTT GAG GTT ATA AAA CCA TTA GTG 1375
Pro Gly Val Ser Gln Asp Arg Lys Val Glu Val Ile Lys Pro Leu Val
390 395 400

GTG GAA ACT GAG AGT GCA AGC AGA GCT ACA TTT GTG CTT GGC GCC TCT 1423
Val Glu Thr Glu Ser Ala Ser Arg Ala Thr Phe Val Leu Gly Ala Ser
405 410 415 420

GGG ACC AGC CCT CCA GTG GCA GCG AGG ACA CAG GAG CTT GAA ATT GAA 1471
Gly Thr Ser Pro Pro Val Ala Ala Arg Thr Gln Glu Leu Glu Ile Glu
425 430 435
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25 



CTC CCC ACT GAG CCT CGG CCT AAT GAA GAA TGT CTG CAG ATA CTG GAG 1519
Leu Pro Ser Glu Pro Arg Pro Asn Glu Glu Cys Leu Gln Ile Leu Glu
440 445 450

ACT GCC GAG AAA GGT GCA AAG TTC CTT AGO GAT GCA GAG ATC ATC CAG 1567
Ser Ala Glu Lys Gly Ala Lys Phe Leu Ser Asp Ala Glu Ile Ile Gln
455 460 465

TTG GTC AAT GCC AAG CAC ATC CCA GCC TAC AAA TTG GAA ACC TTA ATC 1615
Leu Val Asn Ala Lys His Ile Pro Ala Tyr Lys Leu Glu Thr Leu Met
470 475 480

GAA ACT CAT GAA CGT GGT GTA TCT ATT CGC CGG CAG CTC CTC TCC ACA 1663
Glu Thr His Glu Arg Gly Val Ser Ile Arg Arg Gln Leu Leu Ser Thr
485 490 495 500

AAG CTT CCA GAG CCT TCT TCT CTG CAG TAC CTG CCT TAC AGA GAT TAT 1711
Lys Leu Pro Glu Pro Ser Ser Leu Gln Tyr Leu Pro Tyr Arg Asp Tyr
505 510 515

AAT TAT TCC CTG GTG ATG GGA CCT TGC TGT GAG AAT GTG ATC GGA TAT 1759
Asn Tyr Ser Leu Val Met Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr
520 525 530

ATG CCC ATC CCT GTC GGA GTA GCA GGG CCT CTG TGC CTG GAT GGT AAA 1807
Met Pro Ile Pro Val Gly Val Ala Gly Pro Leu Cys Leu Asp Gly Lys
535 540 545

GAG TAC CAG GTT CCA ATG GCA ACA ACG GAA GGC TGT CTG GTG GCC AGC 1855
Glu Tyr Gln Val Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala Ser
550 555 560

ACC AAC AGA GGC TGC AGG GCA ATA GGT CTT GGT GGA GGT GCC AGC AGC 1903
Thr Asn Arg Gly Cys Arg Ala Ile Gly Leu Gly Gly Gly Ala Ser Ser
565 570 575 580

CGG GTC CTT CCA GAT GGG ATG ACC CGG CCC CCA GTG GTG CGT CTT CCT 1951
Arg Val Leu Ala Asp Gly Met Thr Arg Gly Pro Val Val Arg Leu Pro
585 590 595

CGT GCT TGT GAT TCT GCA GAA GTG AAG GCC TGG CTT GAA ACA CCC GAA 1999
Arg Ala Cys Asp Ser Ala Glu Val Lys Ala Trp Leu Glu Thr Pro Glu
600 605 610

GGG TTT GCG GTG ATA AAG GAC GCC TTC GAT AGC ACT AGC AGA TTT GCA 2047
Gly Phe Ala Val Ile Lys Asp Ala Phe Asp Ser Thr Ser Arg Phe Ala
615 620 625

CGT CTA CAG AAG CTT CAT GTG ACC ATG GCA GGG CGC AAC CTG TAC ATC 2095
Arg Leu Gln Lys Leu His Val Thr Met Ala Gly Arg Asn Leu Tyr Ile
630 635 640
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CGT TTC CAG TCC AAG ACA GGG CAT CCC ATG GGG ATG AAC ATG ATT TCC 
Arg Phe Gln Ser Lys Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser
645 650 655 660

AAG GGC ACT GAG AAA GCA CTT CTG AAG CTT CAG GAG TTC TTT CCT GAA
Lys Gly Thr Glu Lys Ala Leu Leu Lys Leu Gln Glu Phe Phe Pro Glu
665 670 675

ATG CAG ATT CTG CCA CTT AGT GGT AAC TAC TGC ACT GAC AAG AAA CCT
Met Gln Ile Leu Ala Val Ser Gly Asn Tyr Cys Thr Asp Lys Lys Pro
680 685 690

GCC GCC ATA AAC TGG ATC GAG GGA AGA GGA AAG ACA CTT GTG TGT GAA
Ala Ala Ile Asn Trp Ile Glu Gly Arg Gly Lys Thr Val Val Cys Glu
695 700 705

GCT GTT ATT CCA GCC AAG GTG GTG AGA GAA GTA TTA AAG ACA ACT ACG
Ala Val Ile Pro Ala Lys Val Val Arg Glu Val Leu Lys Thr Thr Thr
710 715 720

GAA GCT ATG ATT GAC GTA AAC ATT AAC AAG AAT CTT GTG GGT TCT GCC
Glu Ala Met Ile Asp Val Asn Ile Asn Lys Asn Leu Val Gly Ser Ala
725 730 735 740

ATG GCT GGG AGC ATA GGA GGC TAC AAT GCC CAT GCA GCA AAC ATC GTC
Met Ala Gly Ser Ile Gly Gly Tyr Asn Ala His Ala Ala Asn Ile Val
745 750 755

ACT GCT ATC TAC ATT GCA TGT GGC CAG CAT GCA GCA CAG AAT GTG GGG
Thr Ala Ile Tyr Ile Ala Cys Gly Gln Asp Ala Ala Gln Asn Val Gly
760 765 770

AGT TCA AAC TGT ATT ACT TTA ATG GAA GCA AGT GGT CCC ACG AAT GAA
Ser Ser Asn Cys Ile Thr Leu Met Glu Ala Ser Gly Pro Thr Asn Glu
775 780 785

GAC TTG TAT ATC AGC TGC ACC ATG CCA TCT ATA GAG ATA GGA ACT GTC
Asp Leu Tyr Ile Ser Cys Thr Met Pro Ser Ile Glu Ile Gly Thr Val
790 795 800

GGT GGT GGG ACC AAC CTC CTA CCA CAG CAG GCC TGT CTG CAG ATG CTA
Gly Gly Gly Thr Asn Leu Leu Pro Gln Gln Ala Cys Leu Gln Met Leu
805 810 815 820

GCT GTT CAA GCA GCG TGC AAA GAC AAT CCT GGA GAA AAT GCA CGG CAA
Gly Val Gln Gly Ala Cys Lys Asp Asn Pro Gly Glu Asn Ala Arg Gln
825 830 835

CTT GCC CGA ATT GTG TGT GGT ACT GTA ATC GCT GGG GAG TTG TCC TTG
Leu Ala Arg Ile Val Cys Gly Thr Val Met Ala Gly Glu Leu Ser Leu
840 845 850
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ATG GCA GCA TTC GCA GCA GGA CAT CTT GTT AGA ACT CAC ATG GTT CAT 2767 
Met Ala Ala Leu Ala Ala Gly His Leu Val Arg Ser His Met Val His
855 860 865

AAC AGA TCG AAG ATA AAT TTA CAA GAT CTG CAA GGA ACG TGC ACC AAG 2815
Asn Arg Ser Lys Ile Asn Leu Gln Asp Leu Gln Gly Thr Cys Thr Lys
870 875 880

AAG TCA GCT TGAGCAGCCT GACAGTATTG AACTGAAACA CGGGGATTGG 2864
Lys Ser Ala
885

GTTCTCAAGG ACTAACATGA AATCTGTGAA TTAAAAATCT CAATGCAGTG TCTTGTGGAA 2924

GATGAATGAA CGTGATCAGT GAGACGCCTG CTTGGTTTCT GGCTCTTTCA GAGACGTCTG 2984

AGCTCCTTTG CTCGGAGACT CCTCAGATCT GGAAACAGTG TGGTCCTTCC CATGCTGTAT 3044

TCTGAAAAGA TCTCATATGG ATGTTGTGCT CTGAGCACCA CAGATCTGAT CTGCAGCTCG 3104

TTTCTCAAAT GATGGAGTTC ATGGTGATCA CTGTGAGACT GGCCTCTCCC AGCAGGTTAA 3164

AAATGGAGTT TTAAATTATA CTGTAGCTGA CAGTACTTCT GATTTTATAT TTATTTAGTC 3224

TGAGTTGTAG AACTTTCCAA TCTAAGTTTA TTTTTTGTAA CCTAATAATT CATTTGGTGC 3284

TGCTCTATTG ATTTTTGGGG GTAAACAATA TTATTCTTCA GAAGGGGACC TACTTCTTCA 3344

TGGGAAGAAT TACTTTTATT CTCAAACTAC AGAACAATGT GCTAAGCAGT GCTAAATTGT 3404

TCTCATCAAG AAAACAGTCA CTGCATTTAT CTCTGTAGGC CTTTTTTCAG AGAGGCCTTG 3464

TCTAGATTTT TGCCAGCTAG GCTACTGCAT GTCTTAGTCT CAGGCCTTAG GAAAGTGCCA 3524

CGCTCTGCAC TAAAGATATC AGAGCTCTTG GTGTTACTTA GACAAGAGTA TGAGCAAGTC 3584

GGACCTCTCA GAGTGTGGGA ACACAGTTTT GAAAGAAAAA CCATTTCTCT AAGCCAATTT 3644

TCTTTAAAGA CATTTTAACT TATTTAGCTG AGTTCTAGAT TTTTCGGGTA AACTATCAAA 3704

TCTGTATATG TTGTAATAAA GTGTCTTATG CTAGGAGTTT ATTCAAAGTG TTTAAGTAAT 3764

AAAAGGACTC AAATTTACAC TGATAAAATA CTCTAGCTTG GGCCAGAGAA GACAGTGCTC 3824

ATTAGCGTTG TCCAGGAAAC CCTGCTTGCT TGCCAAGCCT AATGAAGGGA AAGTCAGCTT 3884

TCAGAGCCAA TGATGGAGGC CACATGAATG GCCCTCGAGC TGTGTGCCTT GTTCTGTGGC 3944

CAGGAGCTTG GTGACTGAAT CATTTACGGG CTCCTTTGAT GGACCCATAA AAGCTCTTAG 4004

CTTCCTCAGG GGGTCAGCAG AGTTGTTGAA TCTTAATTTT TTTTTTAATG TACCAGTTTT 4064
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20 



GTATAAATAA TAATAAAGAG CTCCTTATTT TGTATTCTAT CTAATGCTTC GAGTTCAGTC 4124

TTGGGAAGCT GACATCTCAT GTAGAAGATG GACTCTGAAA GACATTCCAA GAGTGCAGCG 4184

GCATCATGGG AGCCXGTTAG TGATTGTGTG TCAGTATTAT TGTGGAAGAT TGACTTTGCT 4244

TTTCTATGTG AAGTTTCAGA TTGCTCCTCT TGTGACTOT TAGCCAGTAA CATTTTATTT 4304

ACCTGAGCTT GTCATGGAAG TGGCAGTGAA AAGTATTGAG TATTCATGCT GGTGACTGTA 4364

ACCAATGTCA TCTTGCTAAA AACTCATGTT TTGTACAATT ACTAAATTGT ATACATTTTG 4424

TTATAGAATA CTTTTTCCAG TTGAGTAAAT TATGAAAGGA AGTTAACATT AAGAGGTGTA 4484

AGCGGTGGCT TTTTTAAAAT GAAGGATTAA CCCTAAGCCC GAGACCCACA AGCTAGCAAA 4544

GTCTGGCAGA GTGGTAAACT GTCCTGGTGG GGCCATCGAA TCATCTCTCT CCATTACACT 4604

TTCTAACTTT CCAGCATTGG TGCTGGCCAG TGTATTGTTT CATTGATCTT CCTTACGCTT 4664

AGAGGGTTTG ATTGGTTCAG ATCTATAATC TCAGCCACAT TGTCTTGGTA TGAGCTGGAG 4724

AGAGTTAAGA GGAAGGGAAA ATAAAGTTCA GATAGCCAAA ACAC 4768




(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 887 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Leu Ser Arg Leu Phe Arg Met His Gly Leu Phe Val Ala Ser His
1 5 10 15

Pro Trp Glu Val Ile Val Gly Thr Val Thr Leu Thr Ile Cys Met Met
20 25 30

Ser Met Asn Met Phe Thr Gly Asn Asn Lys Ile Cys Gly Trp Asn Tyr
35 40 45

Glu Cys Pro Lys Phe Glu Glu Asp Val Leu Ser Ser Asp Ile Ile Ile
50 55 60

Leu Thr Ile Thr Arg Cys Ile Ala Ile Leu Tyr Ile Tyr Phe Gln Phe
65 70 75 80



Gln Asn Leu Arg Gln Leu Gly Ser Lys Tyr Ile Leu Gly Ile Ala Gly
85 90 95

Leu Phe Thr Ile Phe Ser Ser Phe Val Phe Ser Thr Val Val Ile His
100 105 110

Phe Leu Asp Lys Glu Leu Thr Gly Leu Asn Glu Ala Leu Pro Phe Phe
115 120 125

Leu Leu Leu Ile Asp Leu Ser Arg Ala Ser Ala Leu Ala Lys Phe Ala
130 135 140

Leu Ser Ser Asn Ser Gln Asp Glu Val Arg Glu Asn Ile Ala Arg Gly
145 150 155 160

Met Ala Ile Leu Gly Pro Thr Phe Thr Leu Asp Ala Leu Val Glu Cys
165 170 175

Leu Val Ile Gly Val Gly Thr Met Ser Gly Val Arg Gln Leu Glu Ile
180 185 190

Met Cys Cys Phe Gly Cys Met Ser Val Leu Ala Asn Tyr Phe Val Phe
195 200 205

Met Thr Phe Phe Pro Ala Cys Val Ser Leu Val Leu Glu Leu Ser Arg
210 215 220

Glu Ser Arg Glu Gly Arg Pro Ile Trp Gln Leu Ser His Phe Ala Arg
225 230 235 240

Val Leu Glu Glu Glu Glu Asn Lys Pro Asn Pro Val Thr Gln Arg Val
245 250 255

Lys Met Ile Met Ser Leu Gly Leu Val Leu Val His Ala His Ser Arg
260 265 270

Trp Ile Ala Asp Pro Ser Pro Gln Asn Ser Thr Thr Glu His Ser Lys
275 280 285

Val Ser Leu Gly Leu Asp Glu Asp Val Ser Lys Arg Ile Glu Pro Ser
290 295 300

Val Ser Leu Trp Gln Phe Tyr Leu Ser Lys Met Ile Ser Met Asp Ile
305 310 315 320

Glu Gln Val Val Thr Leu Ser Leu Ala Phe Leu Leu Ala Val Lys Tyr
325 330 335



Ile Phe Phe Glu Gln Ala Glu Thr Glu Ser Thr Leu Ser Leu Lys Asn
340 345 350

Pro Ile Thr Ser Pro Val Val Thr Pro Lys Lys Ala Pro Asp Asn Cys
355 360 365

Cys Arg Arg Glu Pro Leu Leu Val Arg Arg Ser Glu Lys Leu Ser Ser
370 375 380

Val Glu Glu Glu Pro Gly Val Ser Gln Asp Arg Lys Val Glu Val Ile
385 390 395 400

Lys Pro Leu Val Val Glu Thr Glu Ser Ala Ser Arg Ala Thr Phe Val
405 410 415

Leu Gly Ala Ser Gly Thr Ser Pro Pro Val Ala Ala Arg Thr Gln Glu
420 425 430

Leu Glu Ile Glu Leu Pro Ser Glu Pro Arg Pro Asn Glu Glu Cys Leu
435 440 445

Gln Ile Leu Glu Ser Ala Glu Lys Gly Ala Lys Phe Leu Ser Asp Ala
450 455 460

Glu Ile Ile Gln Leu Val Asn Ala Lys His Ile Pro Ala Tyr Lys Leu
465 470 475 480

Glu Thr Leu Met Glu Thr His Glu Arg Gly Val Ser Ile Arg Arg Gln
485 490 495

Leu Leu Ser Thr Lys Leu Pro Glu Pro Ser Ser Leu Gln Tyr Leu Pro
500 505 510

Tyr Arg Asp Tyr Asn Tyr Ser Leu Val Met Gly Ala Cys Cys Glu Asn
515 520 525

Val Ile Gly Tyr Met Pro Ile Pro Val Gly Val Ala Gly Pro Leu Cys
530 535 540

Leu Asp Gly Lys Glu Tyr Gln Val Pro Met Ala Thr Thr Glu Gly Cys
545 550 555 560

Leu Val Ala Ser Thr Asn Arg Gly Cys Arg Ala Ile Gly Leu Gly Gly
565 570 575

Gly Ala Ser Ser Arg Val Leu Ala Asp Gly Met Thr Arg Gly Pro Val
580 585 590

Val Arg Leu Pro Arg Ala Cys Asp Ser Ala Glu Val Lys Ala Trp Leu
595 600 605

Glu Thr Pro Glu Gly Phe Ala Val Ile Lys Asp Ala Phe Asp Ser Thr
610 615 620
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10 



Ser Arg Phe Ala 
€25 



Leu Gin Lys Leu His Val Thr Met Ala Gly Arg 
630 635. 640 



Asn Leu Tyr lie Arg Phe Gin 

645 



Asn Met lie Ser Lys Gly 

660 



Lys Thr Gly Asp Ala Met Gly Met 

650 655 



Glu Lys Ala Leu Leu Lys Leu Gin Glu 
665 670 



19 



Phe Phe Pro Glu Met Gin He Leu Ala Val 
675 680 



Gly Asn Tyr Cys Thr 
685 



Asp Lys Lys Pro Ala Ala He Asn Trp He Glu Gly Arg Gly Lys Thr 

690 695 700 



20 



25 



30 



Val Val Cys Glu Ala Val He 
705 710 



Ala Lys Val Val 
715 



Glu Val Leu 
720 



Lys Thr Thr Thr Glu Ala Met He Asp Val Asn He Asn Lys Asn Leu 

725 730 735 



Val Gly Ser 



Ala Met Ala Gly 
740 



He Gly Gly Tyr Asn Ala His Ala 
745 750 



Ala Asn lie val Thr Ala lie Tyr He Ala Cys Gly Gin Asp Ala Ala 
755 760 765 



Gin Asn Val Gly Ser 
770 



Asn Cys He Thr Leu Met Glu Ala Ser Gly 

775 780 



Pro Thr Asn Glu Asp Leu Tyr He 
785 790 



Cys Thr Met Pro 

795 



He Glu 

800 



35 



40 



He Gly Thr val Gly Gly Gly Thr Asn Leu Leu Pro Gin Gin Ala Cys 

805 810 815 

Leu Gin Met Leu Gly Val Gin Gly Ala Cys Lys Asp Asn Pro Gly Glu 

820 825' 830 

Asn Ala Arg Gin Leu Ala Arg He Val Cys Gly Thr Val Met Ala Gly 

835 840 845 

Glu Leu Ser Leu Met Ala Ala Leu Ala Ala Gly His Leu Val Arg Ser 
850 855 860 



45 



His Met Val His Asn Arg 
865 870 



Lys Ile Asn Leu Gln Asp Leu Gln Gly

875 880

Thr Cys Thr Lys Lys Ser Ala

885



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(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3348 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 121..3255
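The feature table can be checked for internal consistency by converting the CDS span into a codon count. The following is a minimal sketch only: the helper name `cds_codons` is hypothetical, and it assumes the CDS location is 1-based and inclusive, as is conventional for sequence listings.

```python
def cds_codons(start: int, end: int) -> int:
    """Number of complete codons in a 1-based, inclusive CDS location."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


total_length = 3348             # (A) LENGTH of SEQ ID NO: 5
cds_start, cds_end = 121, 3255  # (B) LOCATION of the CDS feature

codons = cds_codons(cds_start, cds_end)
five_prime_flank = cds_start - 1            # untranslated bases before the CDS
three_prime_flank = total_length - cds_end  # bases after the CDS

print(codons, five_prime_flank, three_prime_flank)  # 1045 120 93
```

The 1045 codons agree with the 1045 amino acid length stated for SEQ ID NO: 6, and the 120-base 5' flank agrees with the two 60-base rows that precede the ATG in the sequence description.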

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGAATATTTT GTACGAGCAA GTTATAGTAA GACACTTCAG TGAGAAATTA ATCTGACTTA 60

CTTTTACTTA ATTGTGTTCT TTCCAAATTA GTTCAACAAG GTTCCCACAT ACAACCTCAA 120 

ATG TCA CTT CCC TTA AAA ACG ATA GTA CAT TTG GTA AAG CCC TTT GCT 168 
Met Ser Leu Pro Leu Lys Thr Ile Val His Leu Val Lys Pro Phe Ala
1 5 10 15

TGC ACT GCT AGG TTT ACT GCG AGA TAC CCA ATC CAC GTC ATT GTT GTT 216 
Cys Thr Ala Arg Phe Ser Ala Arg Tyr Pro Ile His Val Ile Val Val

20 25 30 






GCT GTT TTA TTG AGT GCC GCT GCT TAT CTA TCC GTG ACA CAA TCT TAC 264 
Ala Val Leu Leu Ser Ala Ala Ala Tyr Leu Ser Val Thr Gln Ser Tyr
35 40 45 

CTT AAC GAA TGG AAG CTG GAC TCT AAT CAG TAT TCT ACA TAC TTA AGC 312
Leu Asn Glu Trp Lys Leu Asp Ser Asn Gln Tyr Ser Thr Tyr Leu Ser
50 55 60 

ATA AAG CCG GAT GAG TTG TTT GAA AAA TGC ACA CAC TAC TAT AGG TCT 360 

Ile Lys Pro Asp Glu Leu Phe Glu Lys Cys Thr His Tyr Tyr Arg Ser
65 70 75 80 

CCT GTG TCT GAT ACA TGG AAG TTA CTC AGC TCT AAA GAA GCC GCC GAT 408 

Pro Val Ser Asp Thr Trp Lys Leu Leu Ser Ser Lys Glu Ala Ala Asp

85 90 95 

ATT TAT ACC CCT TTT CAT TAT TAT TTG TCT ACC ATA AGT TTT CAA ACT 456
Ile Tyr Thr Pro Phe His Tyr Tyr Leu Ser Thr Ile Ser Phe Gln Ser

100 105 110 

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AAG GAC AAT TCA ACG ACT TTG CCT TCC CTT GAT GAC CTT ATT TAC ACT 
Lys Asp Asn Ser Thr Thr Leu Pro Ser Leu Asp Asp Val Ile Tyr Ser
115 120 125 

GTT GAC CAT ACC AGG TAC TTA TTA ACT GAA GAG CCA AAG ATA CCA ACT 

Val Asp His Thr Arg Tyr Leu Leu Ser Glu Glu Pro Lys Ile Pro Thr

130 135 140 

GAA CTA GTG TCT GAA AAC GGA ACG AAA TGG AGA TTG AGA AAC AAC AGC 

Glu Leu Val Ser Glu Asn Gly Thr Lys Trp Arg Leu Arg Asn Asn Ser 
145 150 155 160 

AAT TTT ATT TTG GAC CTG CAT AAT ATT TAC CGA AAT ATG GTG AAG CAA 
Asn Phe Ile Leu Asp Leu His Asn Ile Tyr Arg Asn Met Val Lys Gln

165 170 175 

TTT TCT AAC AAA ACG AGC GAA TTT GAT CAG TTC GAT TTG TTT ATC ATC 
Phe Ser Asn Lys Thr Ser Glu Phe Asp Gln Phe Asp Leu Phe Ile Ile

180 185 190 



CTA GCT GCT TAC CTT ACT CTT TTT TAT ACT CTC TCT TGC CTG TTT AAT 
Leu Ala Ala Tyr Leu Thr Leu Phe Tyr Thr Leu Cys Cys Leu Phe Asn
195 200 205 

GAC ATG AGG AAA ATC GGA TCA AAG TTT TGG TTA AGC TTT TCT GCT CTT 

Asp Met Arg Lys Ile Gly Ser Lys Phe Trp Leu Ser Phe Ser Ala Leu
210 215 220 

TCA AAC TCT GCA TCC CCA TTA TAT TTA TCC CTG TAC ACA ACT CAC AGT 
Ser Asn Ser Ala Cys Ala Leu Tyr Leu Ser Leu Tyr Thr Thr His Ser

225 230 235 240 

TTA TTG AAG AAA CCG GCT TCC TTA TTA AGT TTG GTC ATT GGA CTA CCA 
Leu Leu Lys Lys Pro Ala Ser Leu Leu Ser Leu Val Ile Gly Leu Pro

245 250 255 



TTT ATC GTA GTA ATT ATT GGC TTT AAG CAT AAA GTT CGA CTT GCG GCA 
Phe Ile Val Val Ile Ile Gly Phe Lys His Lys Val Arg Leu Ala Ala

260 265 270 

TTC TCG CTA CAA AAA TTC CAC AGA ATT ACT ATT GAC AAG AAA ATA ACG 
Phe Ser Leu Gln Lys Phe His Arg Ile Ser Ile Asp Lys Lys Ile Thr
275 280 285 

GTA AGC AAC ATT ATT TAT GAG GCT ATG TTT CAA GAA GCT GCC TAC TTA 
Val Ser Asn Ile Ile Tyr Glu Ala Met Phe Gln Glu Gly Ala Tyr Leu
290 295 300 

ATC CGC CAC TAC TTA TTT TAT ATT AGC TCC TTC ATT CGA TGT GCT ATT 
Ile Arg Asp Tyr Leu Phe Tyr Ile Ser Ser Phe Ile Gly Cys Ala Ile
305 310 315 320 
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TAT GCT AGA CAT CTT CCC GGA TTC GTC AAT TTC TCT ATT TTG TCT ACA 

Tyr Ala Arg His Leu Pro Gly Leu Val Asn Phe Cys Ile Leu Ser Thr

325 330 335 

TTT ATG CTA GTT TTC GAC TTG CTT TTG TCT GCT ACT TTT TAT TCT GCC
Phe Met Leu Val Phe Asp Leu Leu Leu Ser Ala Thr Phe Tyr Ser Ala 

340 345 350 

ATT TTA TCA ATG AAG CTG GAA ATT AAC ATC ATT CAC AGA TCA ACC GTC 
Ile Leu Ser Met Lys Leu Glu Ile Asn Ile Ile His Arg Ser Thr Val

355 360 365 

ATC AGA GAG ACT TTG GAA GAG GAC GGA GTT GTC CCA ACT ACA GGA GAT 

Ile Arg Gln Thr Leu Glu Glu Asp Gly Val Val Pro Thr Thr Ala Asp

370 375 380 

ATT ATA TAT AAG GAT GAA ACT GCC TCA GAA CCA CAT TTT TTG AGA TCT 
Ile Ile Tyr Lys Asp Glu Thr Ala Ser Glu Pro His Phe Leu Arg Ser
385 390 395 400 

AAC GTG GCT ATC ATT CTG GGA AAA CCA TCA CTT ATT GGT CTT TTG CTT 
Asn Val Ala Ile Ile Leu Gly Lys Ala Ser Val Ile Gly Leu Leu Leu

405 410 415 

CTG ATC AAC CTT TAT GTT TTC ACA GAT AAG TTA AAT GCT ACA ATA CTA 
Leu Ile Asn Leu Tyr Val Phe Thr Asp Lys Leu Asn Ala Thr Ile Leu

420 425 430 

AAC ACG GTA TAT TTT GAC TCT ACA ATT TAG TCG TTA CCA AAT TTT ATC 
Asn Thr Val Tyr Phe Asp Ser Thr Ile Tyr Ser Leu Pro Asn Phe Ile
435 440 445 

AAT TAT AAA GAT ATT GGC AAT CTC ACC AAT CAA GTG ATC ATT TCC GTG 
Asn Tyr Lys Asp Ile Gly Asn Leu Ser Asn Gln Val Ile Ile Ser Val

450 455 460 

TTG CCA AAG CAA TAT TAT ACT CCG CTG AAA AAA TAC CAT CAG ATC GAA 
Leu Pro Lys Gln Tyr Tyr Thr Pro Leu Lys Lys Tyr His Gln Ile Glu

465 470 475 480 

CAT TCT GTT CTA CTT ATC ATT GAT TCC GTT AGC AAT GCT ATT CGG GAC 
Asp Ser Val Leu Leu Ile Ile Asp Ser Val Ser Asn Ala Ile Arg Asp

485 490 495 

CAA TTT ATC AGC AAG TTA CTT TTT TTT GCA TTT GCA GTT AGT ATT TCC 
Gln Phe Ile Ser Lys Leu Leu Phe Phe Ala Phe Ala Val Ser Ile Ser

500 505 510 

ATC AAT GTC TAC TTA CTG AAT GCT GCA AAA ATT CAC ACA GGA TAC ATG 
Ile Asn Val Tyr Leu Leu Asn Ala Ala Lys Ile His Thr Gly Tyr Met
515 520 525 
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AAC TTC CAA CCA CAA TCA AAT AAG ATC GAT GAT CTT GTT GTT CAG CAA 
Asn Phe Gln Pro Gln Ser Asn Lys Ile Asp Asp Leu Val Val Gln Gln
530 535 540 

AAA TCG GCA ACG ATT CAG TTT TCA GAA ACT CGA AGT ATG CCT GCT TCT 

Lys Ser Ala Thr Ile Glu Phe Ser Glu Thr Arg Ser Met Pro Ala Ser

545 550 555 560 

TCT GGC CTA GAA ACT CCA GTG ACC GCG AAA GAT ATA ATT ATC TCT GAA
Ser Gly Leu Glu Thr Pro Val Thr Ala Lys Asp Ile Ile Ile Ser Glu

565 570 575 

GAA ATC CAG AAT AAC GAA TGC GTC TAT GCT TTG AGT TCC CAG GAC GAG 

Glu Ile Gln Asn Asn Glu Cys Val Tyr Ala Leu Ser Ser Gln Asp Glu

580 585 590 

CCT ATC CGT CCT TTA TCG AAT TTA GTG GAA CTT ATG GAG AAA GAA CAA 
Pro Ile Arg Pro Leu Ser Asn Leu Val Glu Leu Met Glu Lys Glu Gln
595 600 605 

TTA AAG AAC ATG AAT AAT ACT GAG GTT TCG AAT CTT GTC GTC AAC GGT 
Leu Lys Asn Met Asn Asn Thr Glu Val Ser Asn Leu Val Val Asn Gly 
610 615 620 

AAA CTG CCA TTA TAT TCC TTA GAG AAA AAA TTA GAG GAC ACA ACT CGT 
Lys Leu Pro Leu Tyr Ser Leu Glu Lys Lys Leu Glu Asp Thr Thr Arg 
625 630 635 640 

GCG GTT TTA GTT AGG AGA AAG GCA CTT TCA ACT TTG GCT GAA TCG CCA 

Ala Val Leu Val Arg Arg Lys Ala Leu Ser Thr Leu Ala Glu Ser Pro 

645 650 655 

ATT TTA GTT TCC GAA AAA TTG CCC TTC AGA AAT TAT GAT TAT GAT CCC 
Ile Leu Val Ser Glu Lys Leu Pro Phe Arg Asn Tyr Asp Tyr Asp Arg

660 665 670 

GTT TTT GGA GCT TGC TGT GAA AAT GTC ATC GGC TAT ATG CCA ATA CCA 

Val Phe Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr Met Pro Ile Pro
675 680 685 



GTT GGT GTA ATT GGT CCA TTA ATT ATT GAT GGA ACA TCT TAT CAC ATA 

Val Gly Val Ile Gly Pro Leu Ile Ile Asp Gly Thr Ser Tyr His Ile

690 695 700 

CCA ATG GCA ACC ACG GAA GGT TGT TTA GTG GCT TCA GCT ATG CGT GGT 
Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala Ser Ala Met Arg Gly

705 710 715 720 

TGC AAA CCC ATC AAT GCT GGT GGT CGT GCA ACA ACT GTT TTA ACC AAA 
Cys Lys Ala Ile Asn Ala Gly Gly Gly Ala Thr Thr Val Leu Thr Lys

725 730 735 
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GAT GGT ATG ACT AGA GGC CCA GTC GTT CGT TTC CCT ACT TTA ATA AGA 2376 
Asp Gly Met Thr Arg Gly Pro Val Val Arg Phe Pro Thr Leu Ile Arg

740 745 750 

TCT GGT GCC TGC AAG ATA TGG TTA GAC TCG GAA GAG GGA CAA AAT TCA 2424 
Ser Gly Ala Cys Lys Ile Trp Leu Asp Ser Glu Glu Gly Gln Asn Ser
755 760 765 

ATT AAA AAA GCT TTT AAT TCT ACA TCA AGG TTT GCA CGT TTG CAA CAT 2472

Ile Lys Lys Ala Phe Asn Ser Thr Ser Arg Phe Ala Arg Leu Gln His

770 775 780 

ATT CAA ACC TGT CTA GCA GGC GAT TTG CTT TTT ATG AGA TTT CGG ACA 2520 
Ile Gln Thr Cys Leu Ala Gly Asp Leu Leu Phe Met Arg Phe Arg Thr
785 790 795 800 

ACT ACC GGT GAC GCA ATG GGT ATG AAC ATG ATA TCG AAA GGT GTC GAA 2568
Thr Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser Lys Gly Val Glu

805 810 815 






TAC TCT TTG AAA CAA ATG GTA GAA GAA TAT GGT TGG GAA GAT ATG GAA 2616 

Tyr Ser Leu Lys Gln Met Val Glu Glu Tyr Gly Trp Glu Asp Met Glu

820 825 830 

GTT GTC TCC GTA TCT GGT AAC TAT TGT ACT GAT AAG AAA CCT GCC GCA 2664 

Val Val Ser Val Ser Gly Asn Tyr Cys Thr Asp Lys Lys Pro Ala Ala 
835 840 845 

ATC AAT TGG ATT GAA GGT CGT GGT AAA ACT GTC GTA GCT GAA GCT ACT 2712 

Ile Asn Trp Ile Glu Gly Arg Gly Lys Ser Val Val Ala Glu Ala Thr
850 855 860 

ATT CCT GGT GAT GTC CTA AAA AGT GTT TTA AAG AGC GAT GTT TCC GCT 2760 
Ile Pro Gly Asp Val Val Lys Ser Val Leu Lys Ser Asp Val Ser Ala
865 870 875 880 

TTA GTT GAA TTA AAT ATA TCC AAG AAC TTG CTT GGA TCC GCA ATG GCT 2808 
Leu Val Glu Leu Asn Ile Ser Lys Asn Leu Val Gly Ser Ala Met Ala

885 890 895 

GGA TCT CTT GGT GGT TTC AAC GCG CAC GCA GCT AAT TTG GTC ACT GCA 2856 
Gly Ser Val Gly Gly Phe Asn Ala His Ala Ala Asn Leu Val Thr Ala 

900 905 910 

CTT TTC TTG GCA TTA GGC CAA GAT CCT GCG CAG AAC GTC GAA AGT TCC 2904 
Leu Phe Leu Ala Leu Gly Gln Asp Pro Ala Gln Asn Val Glu Ser Ser
915 920 925 

AAC TGT ATA ACT TTG ATG AAG GAA GTT GAT CGT GAT TTA AGG ATC TCT 2952
Asn Cys Ile Thr Leu Met Lys Glu Val Asp Gly Asp Leu Arg Ile Ser
930 935 940 
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GTT TCC ATG CCA TCT ATT GAA GTT GGT ACC ATT GGC GGG GOT ACT GTT 
Val Ser Met Pro Ser Ile Glu Val Gly Thr Ile Gly Gly Gly Thr Val
945 950 955 960 

CTG GAG CCT CAG GGC GCC ATG CTT GAT CTT CTC GGC GTT CGT GGT CCT

Leu Glu Pro Gln Gly Ala Met Leu Asp Leu Leu Gly Val Arg Gly Pro

965 970 975 

CAC CCC ACT GAA CCT GGA GCA AAT GCT AGG CAA TTA GCT AGA ATA ATC 

His Pro Thr Glu Pro Gly Ala Asn Ala Arg Gln Leu Ala Arg Ile Ile

980 985 990 

GCG TGT GCT GTC TTG GCT GGT GAA CTG TCT CTG TGC TCC GCA CTT GCT 
Ala Cys Ala Val Leu Ala Gly Glu Leu Ser Leu Cys Ser Ala Leu Ala
995 1000 1005 

GCC GGT CAC CTG GTA CAA AGC CAT ATG ACT CAC AAC CGT AAA ACA AAC 
Ala Gly His Leu Val Gln Ser His Met Thr His Asn Arg Lys Thr Asn

1010 1015 1020 

AAA GCC AAT GAA CTG CCA CAA CCA AGT AAC AAA GGG CCC CCC TGT AAA 

Lys Ala Asn Glu Leu Pro Gln Pro Ser Asn Lys Gly Pro Pro Cys Lys
1025 1030 1035 1040 

ACC TCA GCA TTA TTA TAACTCTTGT AGTTTACATG GTGATACTTT ATATCTTTGT 
Thr Ser Ala Leu Leu 

1045 

ATTGTCTAGC TATTCTAAAT CATCTGCATG TAATAAGAAG TTGATCAAAA TGA 
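The codon rows and the amino acid rows of SEQ ID NO: 5 can be cross-checked by translating the DNA with the standard genetic code. The sketch below is illustrative only: the function `translate` and the compact table construction are assumptions made for this example, and only the first codon row and the final coding codons of the listing are used as input.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA (spaces allowed) into one-letter residues, stopping at a stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


first_row = "ATG TCA CTT CCC TTA AAA ACG ATA GTA CAT TTG GTA AAG CCC TTT GCT"
last_codons = "ACC TCA GCA TTA TTA TAA"

print(translate(first_row))    # MSLPLKTIVHLVKPFA
print(translate(last_codons))  # TSALL
```

In three-letter form these read Met Ser Leu Pro Leu Lys Thr Ile Val His Leu Val Lys Pro Phe Ala and Thr Ser Ala Leu Leu, matching residues 1 to 16 and 1041 to 1045 of the listing; the TAA that follows codon 1045 is read as a stop.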

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1045 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ser Leu Pro Leu Lys Thr Ile Val His Leu Val Lys Pro Phe Ala
1 5 10 15 

Cys Thr Ala Arg Phe Ser Ala Arg Tyr Pro Ile His Val Ile Val Val

20 25 30 

Ala Val Leu Leu Ser Ala Ala Ala Tyr Leu Ser Val Thr Gln Ser Tyr
35 40 45 
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10 






Leu Asn Glu 
50 



Ile Lys
65 



Lys Leu Asp

55 



Asn Gln Tyr Ser Thr Tyr Leu Ser

60 



Asp Glu Leu Phe Glu Lys Cys Thr His Tyr Tyr Arg Ser

70 75 80 



Pro Val Ser Asp Thr Trp Lys Leu Leu Ser 

85 90 



Lys Glu Ala Ala Asp 

95 



Ile Tyr Thr Pro Phe His Tyr Tyr Leu Ser Thr Ile

100 105 



Phe Gln
110 



Lys Asp Asn 
115 



Val Asp His Thr 
130 



Glu Leu Val 
145 



Thr Thr Leu Pro 

120 



Tyr Leu Leu 
135 



Leu Asp Asp Val Ile Tyr

125 



Glu Glu Pro Lys Ile Pro Thr
140 



Glu Asn Gly Thr Lys Trp Arg Leu Arg Asn Asn Ser 
150 155 160 



Asn Phe Ile Leu Asp Leu His Asn Ile Tyr Arg Asn Met Val Lys Gln

165 170 175 



Phe Ser Asn Lys Thr 

180 



Glu Phe Asp Gln Phe Asp Leu Phe Ile Ile
185 190 



Leu Ala Ala Tyr Leu Thr Leu Phe Tyr Thr Leu Cys Cys Leu Phe Asn 
195 200 205 

Asp Met Arg Lys Ile Gly Ser Lys Phe Trp Leu Ser Phe Ser Ala Leu
210 215 220 






Ser Asn Ser Ala Cys Ala Leu Tyr Leu Ser Leu Tyr Thr Thr His Ser 

225 230 235 240 



Leu Leu Lys Lys Pro Ala 

245 



Leu Leu Ser Leu Val Ile Gly Leu Pro
250 255 






Phe Ile Val Val Ile Ile Gly Phe Lys His Lys Val Arg Leu Ala Ala

260 265 270 

Phe Ser Leu Gln Lys Phe His Arg Ile Ser Ile Asp Lys Lys Ile Thr
275 280 285 

Val Ser Asn Ile Ile Tyr Glu Ala Met Phe Gln Glu Gly Ala Tyr Leu
290 295 300 

Ile Arg Asp Tyr Leu Phe Tyr Ile Ser Ser Phe Ile Gly Cys Ala Ile

305 310 315 320 
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Tyr Ala Arg His Leu 

325 



Gly Leu Val Asn Phe Cys Ile Leu Ser Thr

330 335 



Phe Met Leu Val Phe Asp Leu Leu Leu 

340 345 



Ala Thr Phe Tyr Ser Ala 

350 



Ile Leu Ser Met Lys Leu Glu Ile Asn Ile Ile His Arg Ser Thr Val
355 360 365 

Ile Arg Gln Thr Leu Glu Glu Asp Gly Val Val Pro Thr Thr Ala Asp

370 375 380 



Ile Ile Tyr Lys Asp Glu Thr Ala

385 390 



Glu Pro His Phe Leu 

395 



400 



Asn Val Ala Ile Ile Leu Gly Lys Ala Ser Val Ile Gly Leu Leu Leu

405 410 415 

Leu Ile Asn Leu Tyr Val Phe Thr Asp Lys Leu Asn Ala Thr Ile Leu

420 425 430 



Asn Thr Val Tyr Phe Asp 
435 



Thr Ile Tyr
440 



Leu Pro Asn Phe Ile
445 



Asn Tyr Lys Asp Ile Gly Asn Leu

450 455 



Asn Gln Val Ile Ile Ser Val
460 



Leu 
465 



Lys Gln Tyr Tyr Thr Pro Leu Lys Lys Tyr His Gln Ile Glu

470 475 480 



Asp Ser Val Leu Leu Ile Ile Asp Ser Val

485 490 



Asn Ala Ile Arg Asp

495 



Gln Phe Ile Ser Lys Leu Leu Phe Phe Ala Phe Ala Val Ser Ile Ser

500 505 510 

Ile Asn Val Tyr Leu Leu Asn Ala Ala Lys Ile His Thr Gly Tyr Met
515 520 525 



Asn Phe Gln Pro Gln
530 



Asn Lys Ile Asp Asp Leu Val Val Gln Gln
535 540 



Lys Ser Ala Thr Ile Glu Phe
545 550 



Glu Thr Arg 
555 



Met Pro Ala Ser 

560 



Ser Gly Leu Glu Thr 

565 



Val Thr Ala Lys Asp Ile Ile Ile Ser Glu

570 575 



Glu Ile Gln Asn Asn Glu Cys Val Tyr Ala Leu Ser

580 585 



Ser Gln Asp Glu
590 
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lie Arg 

595 



Pro Leu Ser



Asn Leu Val Glu Leu Met Glu Lys Glu Gln
600 605 



Leu Lys Asn Met Asn Asn Thr Glu Val 

610 615 



Asn Leu Val Val Asn Gly 
620 



Lys Leu Pro Leu Tyr Ser Leu Glu Lys Lys Leu Glu Asp Thr Thr Arg
625 630 635 640 

Ala Val Leu Val Arg Arg Lys Ala Leu Ser Thr Leu Ala Glu Ser Pro 

645 650 655 

Ile Leu Val Ser Glu Lys Leu Pro Phe Arg Asn Tyr Asp Tyr Asp Arg

660 665 670 

Val Phe Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr Met Pro Ile Pro

675 680 685 



Val Gly Val Ile Gly
690 



Leu Ile Ile Asp Gly Thr Ser Tyr His Ile
695 700 



Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala 
705 710 715 



Ala Met Arg Gly 

720 



Cys Lys Ala Ile Asn Ala Gly Gly Gly Ala Thr Thr Val Leu Thr Lys

725 730 735 

Asp Gly Met Thr Arg Gly Pro Val Val Arg Phe Pro Thr Leu Ile Arg

740 745 750 

Ser Gly Ala Cys Lys Ile Trp Leu Asp Ser Glu Glu Gly Gln Asn Ser

755 760 765 

Ile Lys Lys Ala Phe Asn Ser Thr Ser Arg Phe Ala Arg Leu Gln His

770 775 780 



Ile Gln Thr Cys Leu Ala Gly Asp Leu Leu Phe Met Arg Phe

785 790 795 



Thr 
800 



Thr Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser Lys Gly Val Glu

805 810 815 

Tyr Ser Leu Lys Gln Met Val Glu Glu Tyr Gly Trp Glu Asp Met Glu

820 825 830 



Val Val Ser Val 
835 



Gly Asn Tyr Cys Thr Asp Lys Lys Pro Ala Ala 
840 845 



Ile Asn Trp Ile Glu Gly Arg Gly Lys Ser Val Val Ala Glu Ala Thr
850 855 860 



Ile Pro Gly Asp Val Val Lys Ser Val Leu Lys Ser Asp Val Ser Ala

865 870 875 880

Leu Val Glu Leu Asn Ile Ser Lys Asn Leu Val Gly Ser Ala Met Ala

885 890 895 

Gly Ser Val Gly Gly Phe Asn Ala His Ala Ala Asn Leu Val Thr Ala 
900 905 910

Leu Phe Leu Ala Leu Gly Gln Asp Pro Ala Gln Asn Val Glu Ser Ser

915 920 925 






Asn Cys Ile Thr Leu Met Lys Glu Val Asp Gly Asp Leu Arg Ile Ser
930 935 940 

Val Ser Met Pro Ser Ile Glu Val Gly Thr Ile Gly Gly Gly Thr Val

945 950 955 960 

Leu Glu Pro Gln Gly Ala Met Leu Asp Leu Leu Gly Val Arg Gly Pro

965 970 975 

His Pro Thr Glu Pro Gly Ala Asn Ala Arg Gln Leu Ala Arg Ile Ile

980 985 990 






Ala Cys Ala Val Leu Ala Gly Glu Leu Ser Leu Cys Ser Ala Leu Ala 
995 1000 1005 

Ala Gly His Leu Val Gln Ser His Met Thr His Asn Arg Lys Thr Asn
1010 1015 1020 

Lys Ala Asn Glu Leu Pro Gln Pro Ser Asn Lys Gly Pro Pro Cys Lys

1025 1030 1035 1040 

Thr Ser Ala Leu Leu 

1045 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GATCCGTCGA CGCATGCCTG CA 



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(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGCATGCGTC GACG 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CCGGATCCGG 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGCTTTCGCG AGCTCGAGAT CTAGATATCG ATG 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 32 base pairs 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AATTCATCGA TATCTAGATC TCGAGCTCGC GA 
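SEQ ID NO: 10 and SEQ ID NO: 11 are largely complementary to one another, which can be confirmed by comparing one strand with the reverse complement of the other. The following is a hedged sketch: the helper names `reverse_complement` and `longest_shared_run` are hypothetical and introduced only for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> str:
    """Longest substring common to a and b (simple dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


seq10 = "AGCTTTCGCGAGCTCGAGATCTAGATATCGATG"  # SEQ ID NO: 10, spaces removed
seq11 = "AATTCATCGATATCTAGATCTCGAGCTCGCGA"   # SEQ ID NO: 11, spaces removed

duplex = longest_shared_run(seq10, reverse_complement(seq11))
print(len(duplex), duplex)  # 28 TCGCGAGCTCGAGATCTAGATATCGATG
```

On these inputs the two oligonucleotides share a 28-base paired region, leaving the 5' bases AGCTT of SEQ ID NO: 10 and AATT of SEQ ID NO: 11 unpaired. Whether those ends are intended as cohesive ends for particular restriction sites is not stated in this part of the listing.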
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TATCGAATTC AAGCTTGGTA CCGA 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TATCGGTACC AAGCTTGAAT TCGA 
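Short linker oligonucleotides such as SEQ ID NO: 12 and SEQ ID NO: 13 can be scanned for restriction enzyme recognition sequences. The sketch below is an illustration under stated assumptions: the enzyme panel in `SITES` is a small hand-picked set of well-known recognition sequences chosen for this example, not a list taken from the disclosure, and `find_sites` is a hypothetical helper that reports only the first occurrence of each site.

```python
# Recognition sequences of a few common restriction enzymes (illustrative panel).
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 1-based start of its first site in seq, if present."""
    hits = {}
    for enzyme, site in SITES.items():
        pos = seq.find(site)
        if pos != -1:
            hits[enzyme] = pos + 1
    return hits


seq12 = "TATCGAATTCAAGCTTGGTACCGA"  # SEQ ID NO: 12, spaces removed
seq13 = "TATCGGTACCAAGCTTGAATTCGA"  # SEQ ID NO: 13, spaces removed

print(find_sites(seq12))  # {'EcoRI': 5, 'HindIII': 11, 'KpnI': 17}
print(find_sites(seq13))  # {'EcoRI': 17, 'HindIII': 11, 'KpnI': 5}
```

Both oligonucleotides carry the same three sites with the HindIII sequence in the middle, in opposite order, which is what one would expect of the two strands of a single linker.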
(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GATCCAGCTC TGTAC 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CCCGGGATCG ATCACGT 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GATCGATCCC GGGACGT 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
ATAAAGACAT TGTTTTTAGA TCTCTTCTAA 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GATTTATCTT CCTTTCCTCC AAGTTTTTCT TC 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGCTTCGAAG AACGAAGGAA GGAGCACACA CTTAG 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
ATTGGTATAT ATACGCATAT TGCGGCCGCG GTAC 
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CGCGGCCGCA ATATGCGTAT ATATAC 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CAATCTAAGT CTGTGCTCCT TCV1TCGTTC TTCGA 

(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTTTATGAGG GTAACATGAA TTCAAGAAGG 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
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10 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GCCAAGTAGT TTTTACTCTT CAAGACACAT AATTTGCTGA CA 42 
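The declared LENGTH of each oligonucleotide entry can be compared against the residues actually listed. The sketch below is illustrative only: `residue_count` is a hypothetical helper, and the three entries are copied from the listing as reproduced here. It does not determine which value is correct when the two disagree.

```python
import re


def residue_count(line: str) -> int:
    """Count A/C/G/T letters on a listing line, ignoring spaces and trailing numbers."""
    return len(re.findall(r"[ACGT]", line.upper()))


# SEQ ID NO -> (declared length in base pairs, sequence line as listed)
entries = {
    7: (22, "GATCCGTCGA CGCATGCCTG CA"),
    23: (30, "CTTTATGAGG GTAACATGAA TTCAAGAAGG"),
    24: (30, "GCCAAGTAGT TTTTACTCTT CAAGACACAT AATTTGCTGA CA 42"),
}

mismatches = {
    seq_id: (declared, residue_count(line))
    for seq_id, (declared, line) in entries.items()
    if declared != residue_count(line)
}
print(mismatches)  # {24: (30, 42)}
```

For SEQ ID NO: 24 the listed sequence has 42 residues, which agrees with the running count printed at the end of its line but not with the declared length of 30 base pairs; whether that discrepancy is a transcription artifact cannot be determined from this text alone.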




Claims 

1. A method of increasing squalene, zymosterol, cholesta-7,24-dienol and cholesta-5,7,24-trienol
accumulation in yeast comprising increasing the expression level of a structural gene encoding a
polypeptide having HMG-CoA reductase activity in a mutant yeast having defects in the expression of
zymosterol-24-methyltransferase and ergosta-5,7,24(28)-trienol-22-dehydrogenase.

2. The method according to claim 1 wherein said encoded polypeptide is an active, truncated HMG-CoA 
reductase enzyme. 


3. The method according to claim 1 wherein said polypeptide is an active, truncated HMG-CoA reductase
enzyme comprising the catalytic and at least a portion of the linker region but is free from the membrane 
binding region of S. cerevisiae HMG-CoA reductase #1. 

4. The method according to claim 1 wherein said structural gene encodes an active, truncated HMG-CoA
reductase enzyme comprising the catalytic and at least a portion of the linker region that is free from the 
membrane binding region of an HMG-CoA reductase enzyme. 

5. The method according to claim 1 wherein the yeast is of the species S. cerevisiae.


6. The method according to claim 1 wherein squalene is accumulated relative to said zymosterol,
cholesta-7,24-dienol and cholesta-5,7,24-trienol by culturing said yeast under conditions of restricted aeration.

7. The method according to claim 1 wherein the expression level is increased by increasing the copy number
of a structural gene encoding a polypeptide having HMG-CoA reductase activity.

8. The method according to claim 7 wherein the copy number is increased by transforming said yeast with
a recombinant DNA molecule comprising a vector operatively linked to an exogenous DNA segment that
encodes a polypeptide having HMG-CoA reductase activity, and a promoter suitable for driving the
expression of said polypeptide in said yeast.

9. The method according to claim 8 wherein the promoter is selected from the group consisting of the GAL 
1, GAL 10, GAL 1-10, PGK and ADH promoters. 



10. The method according to claim 8 wherein the promoter and the exogenous DNA segment are integrated
into the chromosomal DNA of said yeast.

11. A method of increasing squalene, ergosta-8,22-dienol, ergosta-7,22-dienol, ergosta-8-enol and
ergosta-7-enol accumulation in yeast of the species S. cerevisiae comprising transforming a mutant
S. cerevisiae having a defect in the expression of episterol-5-dehydrogenase with a recombinant DNA
molecule comprising a vector operatively linked to an exogenous DNA segment that encodes the catalytic
region and at least a portion of the linker region but is free from the membrane binding region of an
HMG-CoA reductase enzyme, and a promoter suitable for driving the expression of said reductase in said yeast.
12. A method of increasing squalene, zymosterol and cholesta-7,24-dienol accumulation in yeast of the
species S. cerevisiae comprising transforming a mutant S. cerevisiae having a defect in the expression
of zymosterol-24-methyl transferase and episterol-5-dehydrogenase with a recombinant DNA molecule
comprising a vector operatively linked to an exogenous DNA segment that encodes the catalytic region
and at least a portion of the linker region but is free from the membrane binding region of an HMG-CoA
reductase enzyme, and a promoter suitable for driving the expression of said reductase in said yeast.

13. A method of increasing squalene, zymosterol, ergosta-5,7,24(28)-trienol and ergosta-5,7-dienol
accumulation in yeast of the species S. cerevisiae comprising transforming a mutant S. cerevisiae having
a defect in the expression of ergosta-5,7,24(28)-trienol-22-dehydrogenase with a recombinant DNA molecule
comprising a vector operatively linked to an exogenous DNA segment that encodes the catalytic region and
at least a portion of the linker region but is free from the membrane binding region of an HMG-CoA
reductase enzyme, and a promoter suitable for driving the expression of said reductase in said yeast.

14. The method according to claim 11, 12 or 13 wherein the recombinant DNA molecule is selected from the
group of plasmid vectors consisting of plasmids pSOC725ARC, pSOC106ARC, pARC300D, pARC306E,
pARC300S, pARC300T and pARC304S.

15. A mutant S. cerevisiae having defects in the expression of zymosterol-24-methyl transferase and
ergosta-5,7,24(28)-trienol-22-dehydrogenase enzymes, which mutant species is designated ATC0402mu.

16. A mutant of S. cerevisiae having single or double defects in the expression of enzymes that catalyze the
conversion of squalene to ergosterol transformed with a recombinant DNA molecule comprising a vector
operatively linked to an exogenous DNA segment that encodes the catalytic region and at least a portion
of the linker region but is free from the membrane binding region of an HMG-CoA reductase enzyme, and
a promoter suitable for driving the expression of said reductase in said yeast.

17. The mutant according to claim 16 wherein the mutant is selected from the group consisting of mutants
ATC0315rc, ATC1500, ATC1502, ATC1503, ATC1551, ATC2100, ATC2104, ACT2107, ACT2108,
ATC2109 and ATC2401.

18. A recombinant DNA molecule selected from the group of plasmids designated plasmids pARC304S, 
pARC300S, pARC300T, pARC300D, pARC306E, pSOC106ARC and pSOC725ARC.
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